FS#62345 - help in debugging system crash

Attached to Project: Arch Linux
Opened by Carlo Carloni Calame (cmcc) - Sunday, 14 April 2019, 16:30 GMT
Last edited by David Runge (dvzrv) - Friday, 17 January 2020, 13:12 GMT
Task Type Bug Report
Category Kernel
Status Closed
Assigned To No-one
Architecture x86_64
Severity High
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 0
Private No

Details

I have owned a Lenovo T480s for a few months.

I successfully installed Arch Linux, which I have been using on my laptops for many years and which I keep regularly updated.

On the T480s I experience random system hangs, both with linux and linux-lts. At first I thought they were triggered by a sound/audio problem, since they happened while watching videos on YouTube, playing radio streams in Chromium, talking on skypeforlinux or using VidyoDesktop. But yesterday it happened while I was just writing an email on the Gmail website.

The system just hangs, becomes totally unresponsive and has to be hard-reset. The caps lock LED blinks for a while after the crash. If a sound was playing, the last 1-2 seconds keep looping forever. No trace of the crash is left in the journal after the reboot.

I run CPU-intensive programs for my job, but the hangs usually happen when the laptop is under light load.

To catch any sign of the crash, I connected to the T480s over SSH from another machine and ran 'sudo journalctl -f', 'sudo dmesg -w' and 'sudo cat /proc/kmesg', but unfortunately nothing suspicious appears in the logs at the moment of the crash.

I'm wondering how I can debug such hard-to-trace crashes before blaming the hardware and sending the laptop in for repair. Windows is no longer installed, since I wiped it.

For information, I use TLP for power saving and powertop, as suggested on the ArchWiki page, and I installed 'throttled' from https://aur.archlinux.org/packages/throttled/ to work around Intel throttling issues.

Any help would be really appreciated!

Here follows the output of inxi -F:

System: Host: ********** Kernel: 5.0.7-arch1-1-ARCH x86_64 bits: 64 Console: tty 1 Distro: Arch Linux
Machine: Type: Laptop System: LENOVO product: ****** v: ThinkPad T480s serial: ******
Mobo: LENOVO model: ********* serial: *********** UEFI: LENOVO v: N22ET53W (1.30 ) date: 02/19/2019
Battery: ID-1: BAT0 charge: 27.6 Wh condition: 54.0/57.0 Wh (95%)
CPU: Topology: Quad Core model: Intel Core i7-8650U bits: 64 type: MT MCP L2 cache: 8192 KiB
Speed: 842 MHz min/max: 400/4200 MHz Core speeds (MHz): 1: 854 2: 851 3: 863 4: 810 5: 850 6: 851 7: 817 8: 849
Graphics: Device-1: Intel UHD Graphics 620 driver: i915 v: kernel
Display: server: X.Org 1.20.4 driver: i915 resolution: 1920x1080~60Hz
OpenGL: renderer: Mesa DRI Intel UHD Graphics 620 (Kabylake GT2) v: 4.5 Mesa 19.0.2
Audio: Device-1: Intel Sunrise Point-LP HD Audio driver: snd_hda_intel
Sound Server: ALSA v: k5.0.7-arch1-1-ARCH
Network: Device-1: Intel Ethernet I219-LM driver: e1000e
IF: enp0s31f6 state: down mac: ***********
Device-2: Intel Wireless 8265 / 8275 driver: iwlwifi
IF: wlp61s0 state: up mac: ***********
Drives: Local Storage: total: 953.87 GiB used: 171.37 GiB (18.0%)
ID-1: /dev/nvme0n1 vendor: Samsung model: MZVLB1T0HALR-000L7 size: 953.87 GiB
Partition: ID-1: / size: 97.93 GiB used: 25.77 GiB (26.3%) fs: ext4 dev: /dev/nvme0n1p6
ID-2: /boot size: 256.0 MiB used: 129.9 MiB (50.8%) fs: vfat dev: /dev/nvme0n1p1
ID-3: /home size: 823.45 GiB used: 145.48 GiB (17.7%) fs: ext4 dev: /dev/nvme0n1p7
ID-4: swap-1 size: 16.00 GiB used: 0 KiB (0.0%) fs: swap dev: /dev/nvme0n1p5
Sensors: System Temperatures: cpu: 34.0 C mobo: N/A
Fan Speeds (RPM): cpu: 0
Info: Processes: 288 Uptime: 57m Memory: 15.42 GiB used: 3.29 GiB (21.4%) Shell: bash inxi: 3.0.33

Closed by David Runge (dvzrv)
Friday, 17 January 2020, 13:12 GMT
Reason for closing: Won't fix
Additional comments about closing: Not directly related to anything in the repos as of today
Comment by jb (jb.1234abcd) - Monday, 15 April 2019, 06:50 GMT
Less is more, sometimes.
You run three packages whose functionality overlaps (possibly interfering with each other): tlp, powertop, throttled.
Uninstall all of them.
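
Before removing anything, it can help to see which of the three packages are actually installed. The following is only a minimal sketch: the function name `list_power_pkgs` is made up for illustration, and it relies on nothing more than `pacman -Q` returning a non-zero status for a package that is not installed.

```shell
# Report which of the overlapping power-management packages are installed.
# list_power_pkgs is a hypothetical helper name, not part of any package.
list_power_pkgs() {
    for pkg in tlp powertop throttled; do
        # pacman -Q exits non-zero when the package is not installed
        if pacman -Q "$pkg" >/dev/null 2>&1; then
            echo "$pkg: installed"
        else
            echo "$pkg: not installed"
        fi
    done
}
```

Calling `list_power_pkgs` prints one line per package. Anything reported as installed can then be removed with `pacman -R`, one package at a time if you want to see which removal changes the behaviour.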

Update the machine to the latest state (I suggest you run the latest linux kernel).
Judging by your activities (online work: YouTube, e-mail, etc.), the source of trouble could be your wired or wireless network devices or their drivers.
Run it for a while, at least a week, to see if any problems occur - check dmesg after each reboot and journalctl daily anyway.
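
As a rough illustration of the post-reboot check, the commands below pull whatever the journal kept from the boot that ended in the hang. This is a sketch under assumptions: the journal must be stored persistently (a /var/log/journal directory exists), and the function name `previous_boot_logs` is invented here for convenience.

```shell
# Show what the journal kept from the previous boot.
# Assumes persistent journal storage; previous_boot_logs is a hypothetical name.
previous_boot_logs() {
    # Boots known to the journal; offset -1 is the previous boot
    journalctl --list-boots
    # Last 50 kernel messages of the previous boot
    journalctl -k -b -1 -n 50 --no-pager
    # Messages of priority "err" or worse from the previous boot
    journalctl -b -1 -p err --no-pager
}
```

Run `previous_boot_logs` after a hard reset (root may be needed to read the system journal). If the last kernel lines look unremarkable, that still gives an approximate time of the hang to compare against what was running.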

If a problem occurs, then search the Internet for bug reports and proposed fixes - it could be the kernel.
If not, you are set - no need for additional packages like the above, at least for a long time.

If a problem still occurs, try installing the throttled package (it claims to contain fixes for your T480) - run it for a week to see the effect.
Understand what it does (it is experimental, it changes Lenovo system data) - do not expect too much, use its entries to monitor and/or debug the system, and report back to the package maintainer:
https://github.com/erpalma/throttled
If it does not fix anything, uninstall it.

And so on, but do not install these packages at the same time.

https://wiki.archlinux.org/index.php/Lenovo_ThinkPad_T480s

Google search: lenovo t480 freeze
Users report lots of freezes.

You should try to get an original copy of Windows from Lenovo or the reseller - it is a relatively new ThinkPad (since 2018) and in need of BIOS, driver, etc. updates.


Comment by Carlo Carloni Calame (cmcc) - Monday, 15 April 2019, 11:11 GMT
Thanks jb for your comments. I'll start by disabling TLP and powertop, and proceed step by step as you suggest.
