FS#72905 - Running flatpak apps causes kernel: list_add corruption and list_del corruption errors

Attached to Project: Arch Linux
Opened by Dimitris Levogiannis (dimlev) - Thursday, 02 December 2021, 19:26 GMT
Last edited by Toolybird (Toolybird) - Tuesday, 06 June 2023, 04:17 GMT
Task Type Bug Report
Category Kernel
Status Closed
Assigned To Jan Alexander Steffens (heftig)
Architecture x86_64
Severity High
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 1
Private No

Details

Description:
Running flatpak apps causes "kernel: list_add corruption" and "kernel: list_del corruption" errors to appear in the journal. A few times my entire system crashed.
I don't know when these errors started. Reverting to the LTS kernel (5.10.83-1) did not help.

Additional info:
Machine 1:
Hardware: Intel NUC8i5BEH (Coffee Lake i5-8259U, 32 GB non-ECC RAM, Intel Iris 655)
Kernels: 5.15.5.arch1-1, 5.10.83-1 (linux-lts)
Flatpak: 1.12.2

Machine 2:
Hardware: Intel NUC6i7KYK (Skylake i7-6770HQ, 16 GB non-ECC RAM, Intel Iris Pro 580)
Kernel: 5.15.5-arch1-1
Flatpak: 1.12.2, 1.11.3

Machine 3:
Hardware: Intel(R) Xeon(R) CPU E3-1245 V2 @ 3.40GHz, 32 GB ECC RAM
Kernel: 5.15.5-arch1-1
Flatpak: 1.12.2

I can reproduce the issue on the 3 physical machines described above, but when I tried the same thing in a libvirtd/QEMU VM:
- Arch did not generate any errors (5.15.5-arch1-1, flatpak 1.12.2)
- Fedora 35 did not generate any errors (Linux fedora 5.14.18-300.fc35.x86_64, flatpak 1.12.2)
- Ubuntu 20.04 did not generate any errors (5.11.0-41-generic, flatpak 1.12.2)

* log files:
I get thousands of the following:

kernel: ------------[ cut here ]------------
kernel: list_add corruption. next->prev should be prev (ffff9a7a03a90820), but was ffffefc9d2308c48. (next=ffffefc9d219e688).
kernel: WARNING: CPU: 1 PID: 135 at lib/list_debug.c:23 __list_add_valid+0x41/0xb0
kernel: Modules linked in: vhost_net vhost vhost_iotlb tap tun loop snd_seq_dummy snd_hrtimer snd_seq xt_LOG nf_log_syslog xt_recent nf_conntrack_netlink br_netfilter overlay xt_CHECKSUM xt_MASQUERADE nft_chain_na>
kernel: irqbypass snd_hda_codec libarc4 uvcvideo btusb rapl snd_usbmidi_lib intel_cstate snd_hda_core btrtl snd_rawmidi i2c_i801 ipt_REJECT videobuf2_vmalloc btbcm vfat intel_uncore fat pcspkr iwlwifi e1000e i2c_>
kernel: crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel crypto_simd cryptd rtsx_pci xhci_pci_renesas
kernel: CPU: 1 PID: 135 Comm: kswapd0 Tainted: G W 5.15.5-arch1-1 #1 f0168f793e3f707b46715a62fafabd6a40826924
kernel: Hardware name: Intel(R) Client Systems NUC8i5BEH/NUC8BEB, BIOS BECFL357.86A.0089.2021.0621.1343 06/21/2021
kernel: RIP: 0010:__list_add_valid+0x41/0xb0
kernel: Code: 74 63 49 39 f9 74 5e b8 01 00 00 00 31 d2 89 d1 89 d6 89 d7 41 89 d0 41 89 d1 c3 4c 89 c1 48 c7 c7 60 66 eb ba e8 63 be 62 00 <0f> 0b 31 c0 31 d2 89 d1 89 d6 89 d7 41 89 d0 41 89 d1 c3 48 89 d1
kernel: RSP: 0018:ffffb39c808d3ab8 EFLAGS: 00010046
kernel: RAX: 0000000000000000 RBX: 0000000000000006 RCX: 0000000000000000
kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
kernel: RBP: 0000000000000006 R08: 0000000000000000 R09: 0000000000000000
kernel: R10: 0000000000000000 R11: 0000000000000000 R12: ffffefc9cc789d88
kernel: R13: ffffefc9d219e688 R14: ffffefc9cc789d80 R15: ffff9a7a03a90820
kernel: FS: 0000000000000000(0000) GS:ffff9a8160e40000(0000) knlGS:0000000000000000
kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
kernel: CR2: 00007f49c42b76e0 CR3: 0000000223c10003 CR4: 00000000003726e0
kernel: Call Trace:
kernel: <TASK>
kernel: isolate_lru_pages+0x3bf/0x480
kernel: ? shrink_page_list+0x78a/0xee0
kernel: shrink_lruvec+0x613/0xcf0
kernel: ? shrink_node+0x29d/0x700
kernel: shrink_node+0x29d/0x700
kernel: balance_pgdat+0x336/0x6f0
kernel: kswapd+0x1fd/0x3b0
kernel: ? do_wait_intr_irq+0xb0/0xb0
kernel: ? balance_pgdat+0x6f0/0x6f0
kernel: kthread+0x132/0x160
kernel: ? set_kthread_struct+0x50/0x50
kernel: ret_from_fork+0x22/0x30
kernel: </TASK>
kernel: ---[ end trace 2a731310913c0b49 ]---
kernel: ------------[ cut here ]------------
kernel: list_del corruption. prev->next should be ffffefc9cc789d88, but was ffffb39c808d3c00
kernel: WARNING: CPU: 1 PID: 135 at lib/list_debug.c:51 __list_del_entry_valid+0x94/0xc0
kernel: Modules linked in: vhost_net vhost vhost_iotlb tap tun loop snd_seq_dummy snd_hrtimer snd_seq xt_LOG nf_log_syslog xt_recent nf_conntrack_netlink br_netfilter overlay xt_CHECKSUM xt_MASQUERADE nft_chain_na>
kernel: irqbypass snd_hda_codec libarc4 uvcvideo btusb rapl snd_usbmidi_lib intel_cstate snd_hda_core btrtl snd_rawmidi i2c_i801 ipt_REJECT videobuf2_vmalloc btbcm vfat intel_uncore fat pcspkr iwlwifi e1000e i2c_>
kernel: crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel crypto_simd cryptd rtsx_pci xhci_pci_renesas
kernel: CPU: 1 PID: 135 Comm: kswapd0 Tainted: G W 5.15.5-arch1-1 #1 f0168f793e3f707b46715a62fafabd6a40826924
kernel: Hardware name: Intel(R) Client Systems NUC8i5BEH/NUC8BEB, BIOS BECFL357.86A.0089.2021.0621.1343 06/21/2021
kernel: RIP: 0010:__list_del_entry_valid+0x94/0xc0
kernel: Code: c7 70 67 eb ba e8 80 bd 62 00 0f 0b 31 c0 31 d2 89 d6 89 d7 41 89 d0 c3 48 89 f2 48 89 fe 48 c7 c7 a8 67 eb ba e8 60 bd 62 00 <0f> 0b 31 c0 31 d2 89 d6 89 d7 41 89 d0 c3 48 c7 c7 e8 67 eb ba e8
kernel: RSP: 0018:ffffb39c808d3ab8 EFLAGS: 00010046
kernel: RAX: 0000000000000000 RBX: 0000000000000007 RCX: 0000000000000000
kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
kernel: RBP: 0000000000000007 R08: 0000000000000000 R09: 0000000000000000
kernel: R10: 0000000000000000 R11: 0000000000000000 R12: ffffefc9cc789d88
kernel: R13: 0000000000000001 R14: ffffefc9cc789d80 R15: ffff9a7a03a90820
kernel: FS: 0000000000000000(0000) GS:ffff9a8160e40000(0000) knlGS:0000000000000000
kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
kernel: CR2: 00007f49c42b76e0 CR3: 0000000223c10003 CR4: 00000000003726e0
kernel: Call Trace:
kernel: <TASK>
kernel: isolate_lru_pages+0x39a/0x480
kernel: ? shrink_page_list+0x78a/0xee0
kernel: shrink_lruvec+0x613/0xcf0
kernel: ? shrink_node+0x29d/0x700
kernel: shrink_node+0x29d/0x700
kernel: balance_pgdat+0x336/0x6f0
kernel: kswapd+0x1fd/0x3b0
kernel: ? do_wait_intr_irq+0xb0/0xb0
kernel: ? balance_pgdat+0x6f0/0x6f0
kernel: kthread+0x132/0x160
kernel: ? set_kthread_struct+0x50/0x50
kernel: ret_from_fork+0x22/0x30
kernel: </TASK>
kernel: ---[ end trace 2a731310913c0b4a ]---
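
For readers unfamiliar with these messages: both warnings come from the kernel's linked-list sanity checks (lib/list_debug.c, as the WARNING lines above show), which verify that the neighbours of a list entry still point back at each other before the list is modified. The following is only a simplified Python model of that idea, not the kernel's C code. The `Node` class, the function names and the use of node names in place of pointer addresses are assumptions made for illustration, and the kernel's additional poison-value checks are left out.

```python
class Node:
    """Illustrative stand-in for a kernel list entry (hypothetical class)."""

    def __init__(self, name):
        self.name = name
        # A fresh entry points at itself, like an empty circular list.
        self.prev = self
        self.next = self


def list_add_valid(new, prev, next_):
    """Return a description of the problem, or None if inserting is safe."""
    if next_.prev is not prev:
        return (f"list_add corruption. next->prev should be prev ({prev.name}), "
                f"but was {next_.prev.name}. (next={next_.name}).")
    if prev.next is not next_:
        return (f"list_add corruption. prev->next should be next ({next_.name}), "
                f"but was {prev.next.name}. (prev={prev.name}).")
    if new is prev or new is next_:
        return (f"list_add double add: new={new.name}, "
                f"prev={prev.name}, next={next_.name}.")
    return None


def list_del_entry_valid(entry):
    """Return a description of the problem, or None if unlinking is safe."""
    prev, next_ = entry.prev, entry.next
    if prev.next is not entry:
        return (f"list_del corruption. prev->next should be {entry.name}, "
                f"but was {prev.next.name}")
    if next_.prev is not entry:
        return (f"list_del corruption. next->prev should be {entry.name}, "
                f"but was {next_.prev.name}")
    return None


def list_add(new, head):
    """Insert new right after head, refusing when the check fails."""
    prev, next_ = head, head.next
    problem = list_add_valid(new, prev, next_)
    if problem is not None:
        return problem
    next_.prev = new
    new.next = next_
    new.prev = prev
    prev.next = new
    return None


# Build head <-> b <-> a, then overwrite one pointer to imitate corruption.
head, a, b, c, stray = (Node(n) for n in ("head", "a", "b", "c", "stray"))
list_add(a, head)
list_add(b, head)
b.prev = stray                   # something else scribbled over b.prev
print(list_add(c, head))         # the add is refused and reported
print(list_del_entry_valid(b))   # unlinking b is reported as well
```

In this model a single overwritten pointer is enough to trip both checks, which matches the list_add and list_del warnings appearing together in the traces above. It says nothing about what overwrote the pointer in the real kernel.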


Steps to reproduce:
Install flatpak
Install https://flathub.org/apps/details/org.gnome.gitlab.YaLTeR.VideoTrimmer
Open Video Trimmer and select a video to edit.
Kernel errors appear in the logs immediately.

Running the video trimmer app without flatpak does not produce errors.


Closed by Toolybird (Toolybird)
Tuesday, 06 June 2023, 04:17 GMT
Reason for closing: Fixed
Additional comments about closing: See comments
Comment by Dimitris Levogiannis (dimlev) - Thursday, 02 December 2021, 19:38 GMT
Correction: Arch running in the VM did generate kernel errors, during system shutdown.
Comment by Minmo (Minmo) - Thursday, 02 December 2021, 21:44 GMT
Edit: Maybe related to this:

https://www.spinics.net/lists/netdev/msg783650.html

I'm encountering the same issue on the following system:
Arch Linux x86_64
5.10.82-1-lts

CPU: AMD Ryzen 9 3900X (24) @ 3.800GHz
GPU: NVIDIA GeForce GTX 1080 Ti
Memory: 64305MiB

Flatpak version: Flatpak 1.12.2

logs:

```
Dec 02 22:05:19 cellaris kernel: ------------[ cut here ]------------
Dec 02 22:05:19 cellaris kernel: list_add corruption. next->prev should be prev (ffff950bc841f420), but was fffff9011462e288. (next=fffff901359acc08).
Dec 02 22:05:19 cellaris kernel: WARNING: CPU: 10 PID: 202 at lib/list_debug.c:23 __list_add_valid+0x33/0x70
Dec 02 22:05:19 cellaris kernel: Modules linked in: snd_seq_dummy snd_hrtimer snd_seq uinput nvidia_drm(POE) nvidia_modeset(POE) nvidia_uvm(POE) nvidia(POE) wmi_bmof snd_hda_codec_realtek xfs snd_hda_codec_generic ledtrig_audio snd_hda_codec_hdmi snd_hda_>
Dec 02 22:05:19 cellaris kernel: crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel crypto_simd cryptd glue_helper ccp rng_core xhci_pci xhci_pci_renesas
Dec 02 22:05:19 cellaris kernel: CPU: 10 PID: 202 Comm: kswapd0 Tainted: P W OE 5.10.82-1-lts #1
Dec 02 22:05:19 cellaris kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./X570 Pro4, BIOS P1.40 08/12/2019
Dec 02 22:05:19 cellaris kernel: RIP: 0010:__list_add_valid+0x33/0x70
Dec 02 22:05:19 cellaris kernel: Code: f2 75 18 4c 8b 0a 4d 39 c1 75 24 48 39 fa 74 39 49 39 f9 74 34 b8 01 00 00 00 c3 4c 89 c1 48 c7 c7 40 b0 59 b2 e8 30 5e 55 00 <0f> 0b 31 c0 c3 48 89 d1 4c 89 c6 4c 89 ca 48 c7 c7 90 b0 59 b2 e8
Dec 02 22:05:19 cellaris kernel: RSP: 0018:ffffb72d4077fad8 EFLAGS: 00010086
Dec 02 22:05:19 cellaris kernel: RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000027
Dec 02 22:05:19 cellaris kernel: RDX: ffff951a5eca0bb8 RSI: 0000000000000001 RDI: ffff951a5eca0bb0
Dec 02 22:05:19 cellaris kernel: RBP: fffff90107daaac0 R08: 0000000000000000 R09: ffffb72d4077f8f8
Dec 02 22:05:19 cellaris kernel: R10: ffffb72d4077f8f0 R11: ffff951a9f28e180 R12: 0000000000000002
Dec 02 22:05:19 cellaris kernel: R13: ffff950bc841f400 R14: fffff901359acc08 R15: fffff90107daaac8
Dec 02 22:05:19 cellaris kernel: FS: 0000000000000000(0000) GS:ffff951a5ec80000(0000) knlGS:0000000000000000
Dec 02 22:05:19 cellaris kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Dec 02 22:05:19 cellaris kernel: CR2: 00007f80bd5cdda0 CR3: 000000010a670000 CR4: 0000000000350ee0
Dec 02 22:05:19 cellaris kernel: Call Trace:
Dec 02 22:05:19 cellaris kernel: move_pages_to_lru.isra.0+0x284/0x510
Dec 02 22:05:19 cellaris kernel: shrink_inactive_list+0x1ae/0x410
Dec 02 22:05:19 cellaris kernel: shrink_lruvec+0x466/0x720
Dec 02 22:05:19 cellaris kernel: ? do_shrink_slab+0x4d/0x240
Dec 02 22:05:19 cellaris kernel: ? vmpressure+0x56/0x110
Dec 02 22:05:19 cellaris kernel: shrink_node+0x2b1/0x6d0
Dec 02 22:05:19 cellaris kernel: balance_pgdat+0x2fc/0x610
Dec 02 22:05:19 cellaris kernel: kswapd+0x1fb/0x380
Dec 02 22:05:19 cellaris kernel: ? add_wait_queue_exclusive+0x70/0x70
Dec 02 22:05:19 cellaris kernel: ? balance_pgdat+0x610/0x610
Dec 02 22:05:19 cellaris kernel: kthread+0x11b/0x140
Dec 02 22:05:19 cellaris kernel: ? kthread_associate_blkcg+0xa0/0xa0
Dec 02 22:05:19 cellaris kernel: ret_from_fork+0x22/0x30
Dec 02 22:05:19 cellaris kernel: ---[ end trace cc26270002e42f06 ]---
```
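
Since the traces in this report come in two formats (with and without the timestamp and hostname prefix), a small script can help compare how often each warning fires on different machines. This is a minimal sketch that assumes the journal text looks like the excerpts in this report. The function name `tally_warnings` and both regular expressions are made up for illustration and are not part of any tool.

```python
import re
from collections import Counter

# "kernel: list_add corruption..." or "kernel: list_del corruption..."
CORRUPTION_RE = re.compile(r"kernel: (list_(?:add|del)) corruption")
# "kernel: WARNING: CPU: 1 PID: 135 at lib/list_debug.c:23 __list_add_valid+0x41/0xb0"
WARNING_RE = re.compile(r"kernel: WARNING: CPU: \d+ PID: \d+ at (\S+) (\S+)")


def tally_warnings(journal_text):
    """Count corruption messages by kind and WARNING lines by call site."""
    kinds = Counter()
    sites = Counter()
    for line in journal_text.splitlines():
        match = CORRUPTION_RE.search(line)
        if match:
            kinds[match.group(1)] += 1
            continue
        match = WARNING_RE.search(line)
        if match:
            source_location = match.group(1)
            function = match.group(2).split("+")[0]  # drop the +0x41/0xb0 offset
            sites[(source_location, function)] += 1
    return kinds, sites


# Two lines copied from the logs in this report, one per format.
sample = "\n".join([
    "kernel: list_del corruption. prev->next should be ffffefc9cc789d88, but was ffffb39c808d3c00",
    "Dec 02 22:05:19 cellaris kernel: WARNING: CPU: 10 PID: 202 at lib/list_debug.c:23 __list_add_valid+0x33/0x70",
])
kinds, sites = tally_warnings(sample)
print(dict(kinds))
print(dict(sites))
```

The input could be, for example, the kernel messages of the current boot saved from `journalctl -k -b`.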
Comment by Dimitris Levogiannis (dimlev) - Wednesday, 08 December 2021, 08:56 GMT
Kernel 5.15.6-arch2-1 seems to fix the issue for me.
