Arch Linux


FS#72905 - Running flatpak apps cause kernel: list_add corruption and list_del corruption errors

Attached to Project: Arch Linux
Opened by Dimitris Levogiannis (dimlev) - Thursday, 02 December 2021, 19:26 GMT
Last edited by Andreas Radke (AndyRTR) - Thursday, 02 December 2021, 20:01 GMT
Task Type: Bug Report
Category: Kernel
Status: Assigned
Assigned To: Jan Alexander Steffens (heftig)
Architecture: x86_64
Severity: High
Priority: Normal
Reported Version:
Due in Version: Undecided
Due Date: Undecided
Percent Complete: 0%
Votes: 1
Private: No

Details

Description:
Running flatpak apps causes kernel "list_add corruption" and "list_del corruption" warnings to appear in the journal. A few times my entire system crashed.
I don't know when these errors started. Switching to the LTS kernel (5.10.83-1) did not help.

Additional info:
Machine 1:
Hardware: Intel NUC8i5BEH (Coffee Lake i5-8259U, 32 GB non ECC ram, Intel iris 655)
Kernels: 5.15.5.arch1-1, 5.10.83-1 (linux-lts)
Flatpak: 1.12.2

Machine 2:
Hardware: Intel NUC6i7KYK (Skylake i7-6770HQ, 16 GB non ECC ram, Intel iris pro 580)
Kernel: 5.15.5-arch1-1
Flatpak: 1.12.2, 1.11.3

Machine 3:
Hardware: Intel(R) Xeon(R) CPU E3-1245 V2 @ 3.40GHz, 32 GB ECC ram
Kernel: 5.15.5-arch1-1
Flatpak: 1.12.2

I can reproduce the issue on the 3 physical machines described above. I also tried the same steps in libvirtd/qemu VMs:
- Arch did not generate any errors (5.15.5-arch1-1, flatpak 1.12.2)
- Fedora 35 did not generate any errors (Linux fedora 5.14.18-300.fc35.x86_64, flatpak 1.12.2)
- Ubuntu 20.04 did not generate any errors (5.11.0-41-generic, flatpak 1.12.2)

* log files:
I get thousands of the following:

kernel: ------------[ cut here ]------------
kernel: list_add corruption. next->prev should be prev (ffff9a7a03a90820), but was ffffefc9d2308c48. (next=ffffefc9d219e688).
kernel: WARNING: CPU: 1 PID: 135 at lib/list_debug.c:23 __list_add_valid+0x41/0xb0
kernel: Modules linked in: vhost_net vhost vhost_iotlb tap tun loop snd_seq_dummy snd_hrtimer snd_seq xt_LOG nf_log_syslog xt_recent nf_conntrack_netlink br_netfilter overlay xt_CHECKSUM xt_MASQUERADE nft_chain_na>
kernel: irqbypass snd_hda_codec libarc4 uvcvideo btusb rapl snd_usbmidi_lib intel_cstate snd_hda_core btrtl snd_rawmidi i2c_i801 ipt_REJECT videobuf2_vmalloc btbcm vfat intel_uncore fat pcspkr iwlwifi e1000e i2c_>
kernel: crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel crypto_simd cryptd rtsx_pci xhci_pci_renesas
kernel: CPU: 1 PID: 135 Comm: kswapd0 Tainted: G W 5.15.5-arch1-1 #1 f0168f793e3f707b46715a62fafabd6a40826924
kernel: Hardware name: Intel(R) Client Systems NUC8i5BEH/NUC8BEB, BIOS BECFL357.86A.0089.2021.0621.1343 06/21/2021
kernel: RIP: 0010:__list_add_valid+0x41/0xb0
kernel: Code: 74 63 49 39 f9 74 5e b8 01 00 00 00 31 d2 89 d1 89 d6 89 d7 41 89 d0 41 89 d1 c3 4c 89 c1 48 c7 c7 60 66 eb ba e8 63 be 62 00 <0f> 0b 31 c0 31 d2 89 d1 89 d6 89 d7 41 89 d0 41 89 d1 c3 48 89 d1
kernel: RSP: 0018:ffffb39c808d3ab8 EFLAGS: 00010046
kernel: RAX: 0000000000000000 RBX: 0000000000000006 RCX: 0000000000000000
kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
kernel: RBP: 0000000000000006 R08: 0000000000000000 R09: 0000000000000000
kernel: R10: 0000000000000000 R11: 0000000000000000 R12: ffffefc9cc789d88
kernel: R13: ffffefc9d219e688 R14: ffffefc9cc789d80 R15: ffff9a7a03a90820
kernel: FS: 0000000000000000(0000) GS:ffff9a8160e40000(0000) knlGS:0000000000000000
kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
kernel: CR2: 00007f49c42b76e0 CR3: 0000000223c10003 CR4: 00000000003726e0
kernel: Call Trace:
kernel: <TASK>
kernel: isolate_lru_pages+0x3bf/0x480
kernel: ? shrink_page_list+0x78a/0xee0
kernel: shrink_lruvec+0x613/0xcf0
kernel: ? shrink_node+0x29d/0x700
kernel: shrink_node+0x29d/0x700
kernel: balance_pgdat+0x336/0x6f0
kernel: kswapd+0x1fd/0x3b0
kernel: ? do_wait_intr_irq+0xb0/0xb0
kernel: ? balance_pgdat+0x6f0/0x6f0
kernel: kthread+0x132/0x160
kernel: ? set_kthread_struct+0x50/0x50
kernel: ret_from_fork+0x22/0x30
kernel: </TASK>
kernel: ---[ end trace 2a731310913c0b49 ]---
kernel: ------------[ cut here ]------------
kernel: list_del corruption. prev->next should be ffffefc9cc789d88, but was ffffb39c808d3c00
kernel: WARNING: CPU: 1 PID: 135 at lib/list_debug.c:51 __list_del_entry_valid+0x94/0xc0
kernel: Modules linked in: vhost_net vhost vhost_iotlb tap tun loop snd_seq_dummy snd_hrtimer snd_seq xt_LOG nf_log_syslog xt_recent nf_conntrack_netlink br_netfilter overlay xt_CHECKSUM xt_MASQUERADE nft_chain_na>
kernel: irqbypass snd_hda_codec libarc4 uvcvideo btusb rapl snd_usbmidi_lib intel_cstate snd_hda_core btrtl snd_rawmidi i2c_i801 ipt_REJECT videobuf2_vmalloc btbcm vfat intel_uncore fat pcspkr iwlwifi e1000e i2c_>
kernel: crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel crypto_simd cryptd rtsx_pci xhci_pci_renesas
kernel: CPU: 1 PID: 135 Comm: kswapd0 Tainted: G W 5.15.5-arch1-1 #1 f0168f793e3f707b46715a62fafabd6a40826924
kernel: Hardware name: Intel(R) Client Systems NUC8i5BEH/NUC8BEB, BIOS BECFL357.86A.0089.2021.0621.1343 06/21/2021
kernel: RIP: 0010:__list_del_entry_valid+0x94/0xc0
kernel: Code: c7 70 67 eb ba e8 80 bd 62 00 0f 0b 31 c0 31 d2 89 d6 89 d7 41 89 d0 c3 48 89 f2 48 89 fe 48 c7 c7 a8 67 eb ba e8 60 bd 62 00 <0f> 0b 31 c0 31 d2 89 d6 89 d7 41 89 d0 c3 48 c7 c7 e8 67 eb ba e8
kernel: RSP: 0018:ffffb39c808d3ab8 EFLAGS: 00010046
kernel: RAX: 0000000000000000 RBX: 0000000000000007 RCX: 0000000000000000
kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
kernel: RBP: 0000000000000007 R08: 0000000000000000 R09: 0000000000000000
kernel: R10: 0000000000000000 R11: 0000000000000000 R12: ffffefc9cc789d88
kernel: R13: 0000000000000001 R14: ffffefc9cc789d80 R15: ffff9a7a03a90820
kernel: FS: 0000000000000000(0000) GS:ffff9a8160e40000(0000) knlGS:0000000000000000
kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
kernel: CR2: 00007f49c42b76e0 CR3: 0000000223c10003 CR4: 00000000003726e0
kernel: Call Trace:
kernel: <TASK>
kernel: isolate_lru_pages+0x39a/0x480
kernel: ? shrink_page_list+0x78a/0xee0
kernel: shrink_lruvec+0x613/0xcf0
kernel: ? shrink_node+0x29d/0x700
kernel: shrink_node+0x29d/0x700
kernel: balance_pgdat+0x336/0x6f0
kernel: kswapd+0x1fd/0x3b0
kernel: ? do_wait_intr_irq+0xb0/0xb0
kernel: ? balance_pgdat+0x6f0/0x6f0
kernel: kthread+0x132/0x160
kernel: ? set_kthread_struct+0x50/0x50
kernel: ret_from_fork+0x22/0x30
kernel: </TASK>
kernel: ---[ end trace 2a731310913c0b4a ]---


Steps to reproduce:
1. Install flatpak.
2. Install Video Trimmer from https://flathub.org/apps/details/org.gnome.gitlab.YaLTeR.VideoTrimmer
3. Open Video Trimmer and select a video to edit.
4. The kernel errors appear in the logs immediately.

Running the video trimmer app without flatpak does not produce errors.


Comment by Dimitris Levogiannis (dimlev) - Thursday, 02 December 2021, 19:38 GMT
Correction: The Arch VM did generate kernel errors, during system shutdown.

Comment by Minmo (Minmo) - Thursday, 02 December 2021, 21:44 GMT
Edit: Maybe related to this:

https://www.spinics.net/lists/netdev/msg783650.html

I'm encountering the same issue on the following system:
Arch Linux x86_64
5.10.82-1-lts

CPU: AMD Ryzen 9 3900X (24) @ 3.800GHz
GPU: NVIDIA GeForce GTX 1080 Ti
Memory: 64305MiB

Flatpak version: Flatpak 1.12.2

logs:

```
Dec 02 22:05:19 cellaris kernel: ------------[ cut here ]------------
Dec 02 22:05:19 cellaris kernel: list_add corruption. next->prev should be prev (ffff950bc841f420), but was fffff9011462e288. (next=fffff901359acc08).
Dec 02 22:05:19 cellaris kernel: WARNING: CPU: 10 PID: 202 at lib/list_debug.c:23 __list_add_valid+0x33/0x70
Dec 02 22:05:19 cellaris kernel: Modules linked in: snd_seq_dummy snd_hrtimer snd_seq uinput nvidia_drm(POE) nvidia_modeset(POE) nvidia_uvm(POE) nvidia(POE) wmi_bmof snd_hda_codec_realtek xfs snd_hda_codec_generic ledtrig_audio snd_hda_codec_hdmi snd_hda_>
Dec 02 22:05:19 cellaris kernel: crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel crypto_simd cryptd glue_helper ccp rng_core xhci_pci xhci_pci_renesas
Dec 02 22:05:19 cellaris kernel: CPU: 10 PID: 202 Comm: kswapd0 Tainted: P W OE 5.10.82-1-lts #1
Dec 02 22:05:19 cellaris kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./X570 Pro4, BIOS P1.40 08/12/2019
Dec 02 22:05:19 cellaris kernel: RIP: 0010:__list_add_valid+0x33/0x70
Dec 02 22:05:19 cellaris kernel: Code: f2 75 18 4c 8b 0a 4d 39 c1 75 24 48 39 fa 74 39 49 39 f9 74 34 b8 01 00 00 00 c3 4c 89 c1 48 c7 c7 40 b0 59 b2 e8 30 5e 55 00 <0f> 0b 31 c0 c3 48 89 d1 4c 89 c6 4c 89 ca 48 c7 c7 90 b0 59 b2 e8
Dec 02 22:05:19 cellaris kernel: RSP: 0018:ffffb72d4077fad8 EFLAGS: 00010086
Dec 02 22:05:19 cellaris kernel: RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000027
Dec 02 22:05:19 cellaris kernel: RDX: ffff951a5eca0bb8 RSI: 0000000000000001 RDI: ffff951a5eca0bb0
Dec 02 22:05:19 cellaris kernel: RBP: fffff90107daaac0 R08: 0000000000000000 R09: ffffb72d4077f8f8
Dec 02 22:05:19 cellaris kernel: R10: ffffb72d4077f8f0 R11: ffff951a9f28e180 R12: 0000000000000002
Dec 02 22:05:19 cellaris kernel: R13: ffff950bc841f400 R14: fffff901359acc08 R15: fffff90107daaac8
Dec 02 22:05:19 cellaris kernel: FS: 0000000000000000(0000) GS:ffff951a5ec80000(0000) knlGS:0000000000000000
Dec 02 22:05:19 cellaris kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Dec 02 22:05:19 cellaris kernel: CR2: 00007f80bd5cdda0 CR3: 000000010a670000 CR4: 0000000000350ee0
Dec 02 22:05:19 cellaris kernel: Call Trace:
Dec 02 22:05:19 cellaris kernel: move_pages_to_lru.isra.0+0x284/0x510
Dec 02 22:05:19 cellaris kernel: shrink_inactive_list+0x1ae/0x410
Dec 02 22:05:19 cellaris kernel: shrink_lruvec+0x466/0x720
Dec 02 22:05:19 cellaris kernel: ? do_shrink_slab+0x4d/0x240
Dec 02 22:05:19 cellaris kernel: ? vmpressure+0x56/0x110
Dec 02 22:05:19 cellaris kernel: shrink_node+0x2b1/0x6d0
Dec 02 22:05:19 cellaris kernel: balance_pgdat+0x2fc/0x610
Dec 02 22:05:19 cellaris kernel: kswapd+0x1fb/0x380
Dec 02 22:05:19 cellaris kernel: ? add_wait_queue_exclusive+0x70/0x70
Dec 02 22:05:19 cellaris kernel: ? balance_pgdat+0x610/0x610
Dec 02 22:05:19 cellaris kernel: kthread+0x11b/0x140
Dec 02 22:05:19 cellaris kernel: ? kthread_associate_blkcg+0xa0/0xa0
Dec 02 22:05:19 cellaris kernel: ret_from_fork+0x22/0x30
Dec 02 22:05:19 cellaris kernel: ---[ end trace cc26270002e42f06 ]---
```
Comment by Dimitris Levogiannis (dimlev) - Wednesday, 08 December 2021, 08:56 GMT
Kernel 5.15.6-arch2-1 seems to fix the issue for me.
