Arch Linux

FS#77234 - T14 Gen2i laptop occasional blank screen after sleep, and "BUG: kernel NULL pointer dereference"

Attached to Project: Arch Linux
Opened by Attila Vangel (attila123) - Sunday, 22 January 2023, 19:32 GMT
Last edited by Toolybird (Toolybird) - Sunday, 22 January 2023, 20:57 GMT
Task Type: Bug Report
Category: Kernel
Status: Waiting on Response
Assigned To: No-one
Architecture: x86_64
Severity: High
Priority: Normal
Reported Version:
Due in Version: Undecided
Due Date: Undecided
Percent Complete: 0%
Votes: 0
Private: No

Details

Description:
On my new work laptop, a ThinkPad T14 Gen 2i (11th-gen Intel), the screen occasionally does not come back after sleep (it just stays blank/black).
It last happened today, with the 6.1.7-arch1-1 kernel. I upgraded my system just yesterday.
I also see "BUG: kernel NULL pointer dereference <...>" kernel messages on this new laptop (I had none of these on my old ThinkPad T470).
I also tried the LTS kernel, and it happened with that too (at least with 5.15.88-2-lts).

I have not been using this laptop for long, so these messages start on Jan 10:

$ journalctl -o short-precise -k -b all | grep NULL
Jan 10 13:48:16.545471 t470 kernel: BUG: kernel NULL pointer dereference, address: 000000000000010e
Jan 10 13:48:16.560922 t470 kernel: BUG: kernel NULL pointer dereference, address: 0000000000000a69
Jan 12 09:32:00.189451 t470 kernel: BUG: kernel NULL pointer dereference, address: 000000000000010e
Jan 12 09:32:00.197823 t470 kernel: BUG: kernel NULL pointer dereference, address: 0000000000000a69
Jan 16 09:13:21.536894 t470 kernel: BUG: kernel NULL pointer dereference, address: 000000000000010e
Jan 16 09:13:21.545711 t470 kernel: BUG: kernel NULL pointer dereference, address: 0000000000000a69
Jan 16 18:37:22.576113 t470 kernel: BUG: kernel NULL pointer dereference, address: 000000000000010e
Jan 16 18:37:22.585825 t470 kernel: BUG: kernel NULL pointer dereference, address: 0000000000000a69
Jan 18 19:19:05.939491 t470 kernel: BUG: kernel NULL pointer dereference, address: 000000000000010e
Jan 18 21:04:45.603106 t470 kernel: BUG: kernel NULL pointer dereference, address: 000000000000010e
Jan 18 21:04:45.620137 t470 kernel: BUG: kernel NULL pointer dereference, address: 0000000000000a69
Jan 21 15:31:01.678447 t470 kernel: BUG: kernel NULL pointer dereference, address: 000000000000010e
Jan 21 15:31:01.681808 t470 kernel: BUG: kernel NULL pointer dereference, address: 0000000000000a69
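
To see at a glance how often each faulting address shows up, the same journal output could be summarized per address. This is only a minimal sketch: `null_deref_summary` is a hypothetical helper name (not part of journalctl), and it assumes the address is the last field on each matching line, as in the excerpt above.

```shell
# Count "NULL pointer dereference" lines per faulting address.
# Reads journal text on stdin; prints "<count> <address>" per address.
# Assumption: the address is the last whitespace-separated field of the line.
null_deref_summary() {
    grep 'BUG: kernel NULL pointer dereference' |
        awk '{ count[$NF]++ } END { for (a in count) print count[a], a }' |
        LC_ALL=C sort -k2
}

# Usage (same journalctl invocation as above):
#   journalctl -o short-precise -k -b all | null_deref_summary
```

A stable set of addresses across boots would suggest the same code path is hit each time, which may be useful when reporting upstream.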

I am not sure whether it is related, but another user reported the same messages together with the same resume-from-sleep problem at: https://bbs.archlinux.org/viewtopic.php?pid=2080801

Attaching the full output of `journalctl -o short-precise -k -b -1`.

Steps to reproduce:
The last time it happened was after I opened the laptop lid to wake it from sleep: the screen stayed blank. The laptop was on a charger, in case that matters (not sure).

Comment by Toolybird (Toolybird) - Sunday, 22 January 2023, 20:57 GMT
Not much Arch can do here. You could try reporting upstream. But it's probably worth trying a mainline kernel first to see if the problem is reproducible. Please see the pinned comment here [1] for a precompiled mainline kernel you can install. Otherwise, the general kernel troubleshooting steps are documented here [2]. Please let us know what you find out.

[1] https://aur.archlinux.org/packages/linux-mainline
[2] https://wiki.archlinux.org/title/Kernel#Troubleshooting
Comment by Attila Vangel (attila123) - Saturday, 28 January 2023, 00:22 GMT
Hi, thanks for the answer and sorry for getting back late.
I managed to install the kernel from the AUR (although it took some time), but even after a reboot VirtualBox did not work with it out of the box, and I had no "mental energy" left at that point to try to fix it.

Instead, I used Fedora 37 all week to see whether it also had this issue (which would point to a generic kernel issue) or whether it was Arch-only. (I had installed Fedora earlier on another partition as a "plan B", in case of problems with Arch.) I kept using my laptop in the same way.
I had zero "NULL pointer dereference" kernel messages (or lockups after standby) on the exact same laptop under Fedora 37 (with kernel 6.1.6-200.fc37.x86_64; 6.1.7 came out a bit later for Fedora, and I did not want to reboot my laptop during the week, so that I could test stability).
Details again:

$ hostnamectl status | tail -6
CPE OS Name: cpe:/o:fedoraproject:fedora:37
Kernel: Linux 6.1.6-200.fc37.x86_64
Architecture: x86-64
Hardware Vendor: Lenovo
Hardware Model: ThinkPad T14 Gen 2i
Firmware Version: N34ET53W (1.53 )

At that time I checked again for BIOS upgrades and did not find any. Anyway, my laptop has now been stable (on Fedora 37) for:

$ uptime
18:20:09 up 4 days, 18:30, 1 user, load average: 0.42, 0.46, 0.54

So it seems to be an Arch Linux-only issue on this laptop.
I also quickly checked the Arch kernel troubleshooting link, but it did not help me.
Comment by Attila Vangel (attila123) - Saturday, 28 January 2023, 01:04 GMT
I had the idea to capture the lsmod output on both distros, so I did. I then sorted each list and kept only the first "column" (the module name) to make them easier to diff. There are several differences between the module lists of the two distros.
Diff follows.

$ diff lsmod_arch_sorted_modules_only.txt lsmod_fedora_sorted_modules_only.txt
5,10d4
< aesni_intel
< af_alg
< algif_hash
< algif_skcipher
< atkbd
< blake2b_generic
13d6
< bpf_preload
19d11
< btrfs
22c14
< ccm
---
> cdc_ether
25d16
< cmac
28,29d18
< crc16
< crc32c_generic
33,36d21
< cryptd
< crypto_simd
< crypto_user
< dm_mod
41,42d25
< ecdh_generic
< ext4
46d28
< gf128mul
47a30
> hid_multitouch
52d34
< i8042
60d41
< intel_gtt
62,63d42
< intel_lpss
< intel_lpss_pci
71,75c50
< ip6table_filter
< ip6_tables
< iptable_filter
< iptable_nat
< ip_tables
---
> ip_set
81d55
< jbd2
87,88d60
< libcrc32c
< libps2
91,92d62
< mac_hid
< mbcache
98a69
> mii
101,102d71
< mousedev
< mtd
108a78,89
> nf_reject_ipv4
> nf_reject_ipv6
> nf_tables
> nft_chain_nat
> nft_compat
> nft_ct
> nft_fib
> nft_fib_inet
> nft_fib_ipv4
> nft_fib_ipv6
> nft_reject
> nft_reject_inet
113a95
> pinctrl_tigerlake
122,123c104,106
< psmouse
< raid6_pq
---
> qrtr
> r8152
> r8153_ecm
127c110,112
< roles
---
> scsi_dh_alua
> scsi_dh_emc
> scsi_dh_rdac
130d114
< serio
132d115
< sg
143a127
> snd_hrtimer
148a133,135
> snd_seq
> snd_seq_device
> snd_seq_dummy
172,174d158
< spi_intel
< spi_intel_pci
< spi_nor
175a160
> sunrpc
178a164
> tls
181a168
> typec_displayport
182a170
> uas
184c172,173
< usbhid
---
> usbnet
> usb_storage
197d185
< vivaldi_fmap
201,204c189
< xhci_pci
< xhci_pci_renesas
< xor
< x_tables
---
> xfs
209c194,195
< xt_tcpudp
---
> xt_REDIRECT
> zram
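
For anyone who wants to repeat this comparison, the lists could be produced roughly as follows. This is a sketch, not necessarily the exact commands used for the diff above: `module_names` is a hypothetical helper name, and forcing `LC_ALL=C` is an assumption made here so that both machines sort identically (the lists above appear to have been sorted with the default locale instead).

```shell
# Print only the module names from lsmod-style output on stdin, sorted.
# The first line of lsmod output is a header ("Module Size Used by"),
# so it is skipped.
# Assumption: LC_ALL=C is forced so the ordering is the same on both distros.
module_names() {
    awk 'NR > 1 { print $1 }' | LC_ALL=C sort
}

# On Arch:
#   lsmod | module_names > lsmod_arch_sorted_modules_only.txt
# On Fedora:
#   lsmod | module_names > lsmod_fedora_sorted_modules_only.txt
# Then, with both files in one place:
#   diff lsmod_arch_sorted_modules_only.txt lsmod_fedora_sorted_modules_only.txt
```

Keeping only the names avoids noise from the size and use-count columns, which differ between kernels even for identical modules.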
Comment by Attila Vangel (attila123) - Saturday, 28 January 2023, 01:07 GMT
Also, I just updated my Arch install and it got linux-6.1.8.arch1-1-x86_64 (I will test the laptop with this kernel for now, until a lockup occurs, if any) and linux-firmware-20230117.7e4f0ed-1-any (in case that matters for this issue).
If the issue persists, I will try the mainline kernel.