FS#79427 - [linux] 6.4.11-arch1-1 TPM problem prevents login

Attached to Project: Arch Linux
Opened by Bernd Amend (ptb) - Sunday, 20 August 2023, 19:47 GMT
Last edited by Toolybird (Toolybird) - Saturday, 09 September 2023, 01:08 GMT
Task Type: Bug Report
Category: Kernel
Status: Closed
Assigned To: Jan Alexander Steffens (heftig), Levente Polyak (anthraxx)
Architecture: All
Severity: Medium
Priority: Normal
Reported Version:
Due in Version: Undecided
Due Date: Undecided
Percent Complete: 100%
Votes: 0
Private: No

Details

Description:
When using the linux kernel 6.4.11-arch1-1 or 6.4.11-arch2-1 I cannot log in (Wayland, X11, Console, SSH) to the system anymore.
Booting the system with linux-rt 6.3.3.15.realtime2-4 works perfectly.

If I try to log in using the console, I get a short message that is removed too fast to read. Wayland just hangs forever, and I need to hold the power button for 5 seconds.
The only error in journalctl is:

Aug 20 21:14:13 pc kernel: ------------[ cut here ]------------
Aug 20 21:14:13 pc kernel: WARNING: CPU: 1 PID: 1 at kernel/workqueue.c:3185 __flush_work.isra.0+0x270/0x280
Aug 20 21:14:13 pc kernel: Modules linked in:
Aug 20 21:14:13 pc kernel: CPU: 1 PID: 1 Comm: swapper/0 Not tainted 6.4.11-arch2-1 #1 97b2f722c7732577cb713428e1f14bfdbe1faa91
Aug 20 21:14:13 pc kernel: Hardware name: Dell Inc. XPS 15 9560/0YH90J, BIOS 1.31.0 11/10/2022
Aug 20 21:14:13 pc kernel: RIP: 0010:__flush_work.isra.0+0x270/0x280
Aug 20 21:14:13 pc kernel: Code: 8b 04 25 00 37 03 00 48 89 44 24 40 48 8b 73 30 8b 4b 28 e9 e3 fe ff ff 40 30 f6 4c 8b 3e e9 21 fe ff ff 0f 0b e9 3a ff ff ff <0f> 0b e9 33 ff ff ff e8 34 1f c9 00 0f 1f 40 00 90 90 90 90 90 90
Aug 20 21:14:13 pc kernel: RSP: 0000:ffffb47cc006bb80 EFLAGS: 00010246
Aug 20 21:14:13 pc kernel: RAX: 0000000000000000 RBX: ffff9e0e40ea0828 RCX: 0000000000000000
Aug 20 21:14:13 pc kernel: RDX: 0000000000000004 RSI: ffffb47cc0700008 RDI: ffffb47cc006bbc8
Aug 20 21:14:13 pc kernel: RBP: ffff9e0e40ea0868 R08: 0000000000000002 R09: 0000000000000000
Aug 20 21:14:13 pc kernel: R10: 0000000000000001 R11: 0000000000000100 R12: ffff9e0e41f32000
Aug 20 21:14:13 pc kernel: R13: ffffb47cc006bb80 R14: 0000000000000001 R15: ffff9e0e41f87c10
Aug 20 21:14:13 pc kernel: FS: 0000000000000000(0000) GS:ffff9e159e440000(0000) knlGS:0000000000000000
Aug 20 21:14:13 pc kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Aug 20 21:14:13 pc kernel: CR2: 0000000000000000 CR3: 00000005f7820001 CR4: 00000000003706e0
Aug 20 21:14:13 pc kernel: Call Trace:
Aug 20 21:14:13 pc kernel: <TASK>
Aug 20 21:14:13 pc kernel: ? __flush_work.isra.0+0x270/0x280
Aug 20 21:14:13 pc kernel: ? __warn+0x81/0x130
Aug 20 21:14:13 pc kernel: ? __flush_work.isra.0+0x270/0x280
Aug 20 21:14:13 pc kernel: ? report_bug+0x171/0x1a0
Aug 20 21:14:13 pc kernel: ? handle_bug+0x3c/0x80
Aug 20 21:14:13 pc kernel: ? exc_invalid_op+0x17/0x70
Aug 20 21:14:13 pc kernel: ? asm_exc_invalid_op+0x1a/0x20
Aug 20 21:14:13 pc kernel: ? __flush_work.isra.0+0x270/0x280
Aug 20 21:14:13 pc kernel: tpm_tis_remove+0xaa/0x100
Aug 20 21:14:13 pc kernel: tpm_tis_core_init+0x235/0xa00
Aug 20 21:14:13 pc kernel: tpm_tis_plat_probe+0xda/0x120
Aug 20 21:14:13 pc kernel: platform_probe+0x41/0xa0
Aug 20 21:14:13 pc kernel: really_probe+0x19b/0x3e0
Aug 20 21:14:13 pc kernel: ? __pfx___driver_attach+0x10/0x10
Aug 20 21:14:13 pc kernel: __driver_probe_device+0x78/0x160
Aug 20 21:14:13 pc kernel: driver_probe_device+0x1f/0x90
Aug 20 21:14:13 pc kernel: __driver_attach+0xd2/0x1c0
Aug 20 21:14:13 pc kernel: bus_for_each_dev+0x85/0xd0
Aug 20 21:14:13 pc kernel: bus_add_driver+0x116/0x220
Aug 20 21:14:13 pc kernel: driver_register+0x59/0x100
Aug 20 21:14:13 pc kernel: ? __pfx_init_tis+0x10/0x10
Aug 20 21:14:13 pc kernel: init_tis+0x34/0x100
Aug 20 21:14:13 pc kernel: ? __pfx_init_tis+0x10/0x10
Aug 20 21:14:13 pc kernel: do_one_initcall+0x5a/0x240
Aug 20 21:14:13 pc kernel: kernel_init_freeable+0x1d4/0x320
Aug 20 21:14:13 pc kernel: ? __pfx_kernel_init+0x10/0x10
Aug 20 21:14:13 pc kernel: kernel_init+0x1a/0x1c0
Aug 20 21:14:13 pc kernel: ret_from_fork+0x29/0x50
Aug 20 21:14:13 pc kernel: </TASK>
Aug 20 21:14:13 pc kernel: ---[ end trace 0000000000000000 ]---
Aug 20 21:14:13 pc kernel: tpm_tis: probe of MSFT0101:00 failed with error -1
Aug 20 21:14:13 pc kernel: AMD-Vi: AMD IOMMUv2 functionality not available on this system - This is not a bug.
Aug 20 21:14:13 pc kernel: ACPI: bus type drm_connector registered
Aug 20 21:14:13 pc kernel: ahci 0000:00:17.0: version 3.0
Aug 20 21:14:13 pc kernel: ahci 0000:00:17.0: SSS flag set, parallel bus scan disabled
Aug 20 21:14:13 pc kernel: ahci 0000:00:17.0: AHCI 0001.0301 32 slots 1 ports 6 Gbps 0x2 impl SATA mode
Aug 20 21:14:13 pc kernel: ahci 0000:00:17.0: flags: 64bit ncq sntf stag pm led clo only pio slum part ems deso sadm sds apst
Aug 20 21:14:13 pc kernel: scsi host0: ahci
Aug 20 21:14:13 pc kernel: scsi host1: ahci
Aug 20 21:14:13 pc kernel: ata1: DUMMY
Aug 20 21:14:13 pc kernel: ata2: SATA max UDMA/133 abar m2048@0xed133000 port 0xed133180 irq 131
Aug 20 21:14:13 pc kernel: usbcore: registered new interface driver usbserial_generic
Aug 20 21:14:13 pc kernel: usbserial: USB Serial support registered for generic
Aug 20 21:14:13 pc kernel: rtc_cmos 00:02: RTC can wake from S4
Aug 20 21:14:13 pc kernel: rtc_cmos 00:02: registered as rtc0
Aug 20 21:14:13 pc kernel: rtc_cmos 00:02: setting system clock to 2023-08-20T19:14:09 UTC (1692558849)
Aug 20 21:14:13 pc kernel: rtc_cmos 00:02: alarms up to one month, y3k, 242 bytes nvram
Aug 20 21:14:13 pc kernel: intel_pstate: Intel P-state driver initializing
Aug 20 21:14:13 pc kernel: intel_pstate: Disabling energy efficiency optimization
Aug 20 21:14:13 pc kernel: intel_pstate: HWP enabled
Aug 20 21:14:13 pc kernel: ledtrig-cpu: registered to indicate activity on CPUs
Aug 20 21:14:13 pc kernel: [drm] Initialized simpledrm 1.0.0 20200625 for simple-framebuffer.0 on minor 0
Aug 20 21:14:13 pc kernel: Console: switching to colour frame buffer device 240x67
Aug 20 21:14:13 pc kernel: Linux agpgart interface v0.103
Aug 20 21:14:13 pc kernel: ACPI: battery: Slot [BAT0] (battery present)
Aug 20 21:14:13 pc kernel: Freeing initrd memory: 21376K

Additional info:
* package version(s) 6.4.11-arch1-1, 6.4.11-arch2-1
* config and/or log files etc.
* link to upstream bug report, if any

Steps to reproduce:
- Update to 6.4.11-arch2-1
- Reboot
- Try to log in

Closed by Toolybird (Toolybird)
Saturday, 09 September 2023, 01:08 GMT
Reason for closing: Duplicate
Additional comments about closing: FS#79439
Comment by Chris Drzewiecki (cdrzewiecki) - Sunday, 20 August 2023, 20:39 GMT
I also had issues with this kernel when updating to it. My system got stuck at "Loading initial ramdisk" after rebooting to load the new kernel. Unfortunately I don't know how to grab logs (since I wasn't able to get into the system at all post-update), but I can definitely try to do so if someone would be kind enough to point me to instructions on how to get said logs.
Comment by loqs (loqs) - Sunday, 20 August 2023, 21:07 GMT
@ptb can you try earlier releases of the linux package available from the ALA [1] to determine which release introduced the issue? See also [2].

[1] https://wiki.archlinux.org/title/Arch_Linux_Archive
[2] https://wiki.archlinux.org/title/Kernel#Debugging_regressions
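
A minimal sketch of how one specific release could be fetched from the ALA. It assumes the archive keeps its usual layout of packages/<first letter>/<name>/ and that the packaged version string of the last working kernel is 6.4.10.arch1-1. The helper name ala_url is made up for illustration, and the script only prints the pacman command instead of running it.

```shell
#!/bin/sh
# Build the Arch Linux Archive URL for one package release.
# Assumed layout: /packages/<first letter>/<pkgname>/<pkgname>-<version>-<arch>.pkg.tar.zst
ala_url() {
    pkg=$1
    ver=$2
    arch=${3:-x86_64}
    first=$(printf '%s' "$pkg" | cut -c1)
    printf 'https://archive.archlinux.org/packages/%s/%s/%s-%s-%s.pkg.tar.zst\n' \
        "$first" "$pkg" "$pkg" "$ver" "$arch"
}

# Print (rather than run) the downgrade command for the candidate release.
echo "pacman -U $(ala_url linux 6.4.10.arch1-1)"
```

Repeating this for a few versions around the suspected update narrows the regression to a single package release before any git bisection is needed.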
Comment by Bernd Amend (ptb) - Sunday, 20 August 2023, 21:43 GMT
It works with <=6.4.10-arch1-1. The issue was introduced in version 6.4.11-arch1-1.
Bisecting the issue will take a while. But I noticed that the release log contains a number of tpm_tis changes https://cdn.kernel.org/pub/linux/kernel/v6.x/ChangeLog-6.4.11
Comment by Toolybird (Toolybird) - Monday, 21 August 2023, 00:24 GMT
Yeah, looks like another TPM related regression (but this time tpm_tis). It will need to be reported upstream.
Comment by loqs (loqs) - Monday, 21 August 2023, 02:11 GMT
> Bisecting the issue will take a while. But I noticed that the release log contains a number of tpm_tis changes https://cdn.kernel.org/pub/linux/kernel/v6.x/ChangeLog-6.4.11

$ git bisect start -- drivers/char/tpm/
$ git bisect bad v6.4.11
$ git bisect good v6.4.10
Bisecting: 2 revisions left to test after this (roughly 2 steps)
[6b718101cd99de9d9357faeccac1f40ab6db6e0b] tpm/tpm_tis: Disable interrupts for Lenovo P620 devices

https://drive.google.com/file/d/1G0dadJkGOtEnrbZNVn_AWN7E5gWXPaqu/view?usp=sharing linux-6.4.10.r3.g6b718101cd99-1-x86_64.pkg.tar.zst
https://drive.google.com/file/d/1kMth9TecmaR273uJ0oTP__9EEo0-UZQe/view?usp=sharing linux-headers-6.4.10.r3.g6b718101cd99-1-x86_64.pkg.tar.zst
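
The test package names in this thread look like they were derived from `git describe --long` output, so that r3 would mean "3 commits after the v6.4.10 tag". That is an assumption based on the common VCS package versioning convention, not something stated here. Under that assumption, the conversion can be sketched as follows (describe_to_pkgver is a hypothetical helper name):

```shell
#!/bin/sh
# Convert `git describe --long` style output (e.g. v6.4.10-3-g6b718101cd99)
# into a pacman-style version (6.4.10.r3.g6b718101cd99).
# Assumption: the tag starts with "v" and contains no "-" itself.
describe_to_pkgver() {
    printf '%s\n' "$1" | sed 's/^v//; s/\([^-]*-g\)/r\1/; s/-/./g'
}

describe_to_pkgver v6.4.10-3-g6b718101cd99
```

Read this way, the rN part gives the position of each test kernel between v6.4.10 and v6.4.11, which makes it easier to see the order of the tested commits.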
Comment by Jan Alexander Steffens (heftig) - Tuesday, 22 August 2023, 07:35 GMT
Please try booting with tpm_tis.interrupts=1 in your kernel command line.
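
After rebooting, it is worth confirming that the parameter actually reached the kernel. A minimal sketch, assuming a POSIX shell; cmdline_value is a hypothetical helper, and the sample command line (including the root device) is invented for the example. On a running system the first argument would be "$(cat /proc/cmdline)".

```shell
#!/bin/sh
# Print the value of a name=value kernel parameter, or fail if it is absent.
# $1: kernel command line string, $2: parameter name
cmdline_value() {
    for word in $1; do
        case $word in
            "$2"=*)
                printf '%s\n' "${word#*=}"
                return 0
                ;;
        esac
    done
    return 1
}

# Invented sample; replace with "$(cat /proc/cmdline)" on a real system.
sample='root=/dev/nvme0n1p2 rw quiet tpm_tis.interrupts=1'
cmdline_value "$sample" tpm_tis.interrupts
```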
Comment by Raymond Jay Golo (intersectRaven) - Saturday, 02 September 2023, 09:20 GMT
This might be the same I've experienced. Please try using the patch found here:

https://lore.kernel.org/linux-integrity/20230822231510.2263255-1-jarkko%40kernel.org/
Comment by loqs (loqs) - Saturday, 02 September 2023, 10:13 GMT
@intersectRaven does your issue produce the same kernel messages as in the original report?
Arch has carried v1 of that patch in v6.4.11-arch2 [1] and v6.4.12-arch1 [2]; it has v2 in 6.5-arch1 [3]. Is your system affected when using one of those versions, and does updating the patch to v3 solve the issue for you on one of them?

[1] https://github.com/archlinux/linux/commit/566957ccbb2a6d871dcb7918321a22a4b9c83732
[2] https://github.com/archlinux/linux/commit/e5f6f3e36b5e3a3bae8d890f70d3508edc4049be
[3] https://github.com/archlinux/linux/commit/82cb6b9307c8ad38725e47a3e704d13502df724f
Comment by Raymond Jay Golo (intersectRaven) - Saturday, 02 September 2023, 10:58 GMT
@loqs No. I didn't know that the v1 of the patch was already in the current kernels. Please ignore my comment.
Comment by loqs (loqs) - Tuesday, 05 September 2023, 08:56 GMT
@ptb is the issue still present in 6.5.1.arch1-1 (currently in core-testing)? If so, how is the bisection progressing?
Comment by Bernd Amend (ptb) - Thursday, 07 September 2023, 19:17 GMT
Sorry for the long delay, I got sick and needed some time to recover.
- tpm_tis.interrupts=0 same issue
- tpm_tis.interrupts=1 same issue
The issue still exists with 6.5.2-arch1-1 from core-testing.
Comment by loqs (loqs) - Thursday, 07 September 2023, 19:37 GMT
How is the bisection progressing? Alternatively, you could try linux-6.4.10.r3.g6b718101cd99 from [1].

[1] https://bugs.archlinux.org/task/79427#comment221166
Comment by Bernd Amend (ptb) - Thursday, 07 September 2023, 19:43 GMT
The package from [1] works without any issues.
[1] https://bugs.archlinux.org/task/79427#comment221166
Comment by loqs (loqs) - Thursday, 07 September 2023, 21:33 GMT
$ git bisect good
Bisecting: 0 revisions left to test after this (roughly 1 step)
[27722a5a5c30a4d4d426646fbc2673dded611124] tpm: tpm_tis: Fix UPX-i11 DMI_MATCH condition

https://drive.google.com/file/d/1EsPdts-NLQ6j6d7iYURfxax5yfac5IDD/view?usp=sharing linux-6.4.10.r47.g27722a5a5c30-1-x86_64.pkg.tar.zst
https://drive.google.com/file/d/17mWzFydlbx-0WTz0zsoApINW4w20bADV/view?usp=sharing linux-headers-6.4.10.r47.g27722a5a5c30-1-x86_64.pkg.tar.zst
Comment by Bernd Amend (ptb) - Thursday, 07 September 2023, 21:49 GMT
This version is bad, but the error is a little bit different.
- I still see the error message I reported earlier
- I can log in
- The system is extremely slow, booting takes nearly a minute and all applications are so slow that they are unusable.
Comment by loqs (loqs) - Thursday, 07 September 2023, 22:12 GMT
$ git bisect bad
Bisecting: 0 revisions left to test after this (roughly 0 steps)
[d75c2b5e06bc802c4bffa96b75a800e7f8b5de15] tpm: Add a helper for checking hwrng enabled

https://drive.google.com/file/d/1as0Sc-ogou48_FGNyvOZSrmZpQvc5W_L/view?usp=sharing linux-6.4.10.r4.gd75c2b5e06bc-1-x86_64.pkg.tar.zst
https://drive.google.com/file/d/1VyxyWb1aLDa654wDPsJa7bKWCuSBneqR/view?usp=sharing linux-headers-6.4.10.r4.gd75c2b5e06bc-1-x86_64.pkg.tar.zst
Comment by Bernd Amend (ptb) - Friday, 08 September 2023, 05:25 GMT
This version is a little more interesting:
- I can log in without issues and everything works as expected
- The message I initially reported is visible in dmesg
Comment by Bernd Amend (ptb) - Friday, 08 September 2023, 05:33 GMT
I assume this could mean the dmesg message and the login issue could be unrelated.
Comment by loqs (loqs) - Friday, 08 September 2023, 06:55 GMT
> I assume this could mean the dmesg message and the login issue could be unrelated.
Yes, and that would mean the bisection, which was targeting tpm, will need to be adjusted.
Let's check what happens when reverting the commits the tpm bisection identified as the cause.

If the last test was good then:
$ git bisect good
27722a5a5c30a4d4d426646fbc2673dded611124 is the first bad commit
commit 27722a5a5c30a4d4d426646fbc2673dded611124
Author: Peter Ujfalusi <peter.ujfalusi@linux.intel.com>
Date: Tue Aug 8 12:48:36 2023 +0300

tpm: tpm_tis: Fix UPX-i11 DMI_MATCH condition

commit 51e5e551af53259e0274b0cd4ff83d8351fb8c40 upstream.

The patch which made it to the kernel somehow changed the
match condition from
DMI_MATCH(DMI_PRODUCT_NAME, "UPX-TGL01")
to
DMI_MATCH(DMI_PRODUCT_VERSION, "UPX-TGL")

Revert back to the correct match condition to disable the
interrupt mode on the board.

Cc: stable@vger.kernel.org # v6.4+
Fixes: edb13d7bb034 ("tpm: tpm_tis: Disable interrupts *only* for AEON UPX-i11")
Link: https://lore.kernel.org/lkml/20230524085844.11580-1-peter.ujfalusi@linux.intel.com/
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drivers/char/tpm/tpm_tis.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

https://drive.google.com/file/d/1rp6CjvLzEfHOdOgGM3UchIxZ8Xl94F7K/view?usp=sharing linux-6.4.11.arch1-1.1-x86_64.pkg.tar.zst
https://drive.google.com/file/d/18m372OMzfjFigWHGfB2LqvH65RmZs6Ze/view?usp=sharing linux-headers-6.4.11.arch1-1.1-x86_64.pkg.tar.zst

If the last test was bad then:
$ git bisect bad
d75c2b5e06bc802c4bffa96b75a800e7f8b5de15 is the first bad commit
commit d75c2b5e06bc802c4bffa96b75a800e7f8b5de15
Author: Mario Limonciello <mario.limonciello@amd.com>
Date: Mon Aug 7 23:12:29 2023 -0500

tpm: Add a helper for checking hwrng enabled

commit cacc6e22932f373a91d7be55a9b992dc77f4c59b upstream.

The same checks are repeated in three places to decide whether to use
hwrng. Consolidate these into a helper.

Also this fixes a case that one of them was missing a check in the
cleanup path.

Fixes: 554b841d4703 ("tpm: Disable RNG for all AMD fTPMs")
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drivers/char/tpm/tpm-chip.c | 19 ++++++++++++++-----
1 file changed, 14 insertions(+), 5 deletions(-)

https://drive.google.com/file/d/1knsj5wNKk8L2E1rs_OSna0-8fePSMZtP/view?usp=sharing linux-6.4.11.arch1-1.2-x86_64.pkg.tar.zst
https://drive.google.com/file/d/1hK9y8SpJxuWZmUMOJAqZB6cW2ZaulJYx/view?usp=sharing linux-headers-6.4.11.arch1-1.2-x86_64.pkg.tar.zst
Comment by Bernd Amend (ptb) - Friday, 08 September 2023, 07:54 GMT
I assume this could mean the dmesg message and the login issue could be unrelated.
Comment by loqs (loqs) - Friday, 08 September 2023, 11:49 GMT
>I assume this could mean the dmesg message and the login issue could be unrelated.
As I wrote in my last message: yes, and that would mean the bisection, which was targeting tpm, will need to be adjusted.
$ git bisect start
$ git bisect bad v6.4.11
As you had no issues with the first bisection kernel, use that commit as the good starting point:
$ git bisect good 6b718101cd99de9d9357faeccac1f40ab6db6e0b
Bisecting: 101 revisions left to test after this (roughly 7 steps)
[af406bdbf3b1f55979e05eda5ff8b235a1578efd] KVM: arm64: Fix hardware enable/disable flows for pKVM

https://drive.google.com/file/d/1fr095rAjwCsT72MT8SJhFEJiZZnMSc9Z/view?usp=sharing linux-6.4.10.r105.gaf406bdbf3b1-1-x86_64.pkg.tar.zst
https://drive.google.com/file/d/1cZq8fF59GG9_YQ2LqdgsM1O1Vriuynh2/view?usp=sharing linux-headers-6.4.10.r105.gaf406bdbf3b1-1-x86_64.pkg.tar.zst
Comment by Bernd Amend (ptb) - Friday, 08 September 2023, 13:12 GMT
The duplicate message I wrote at 7:54 wasn't intentional; it was caused by reloading this page.
linux-6.4.11.arch1-1.2-x86_64.pkg.tar.zst is bad but doesn't show the tpm message. (I checked with journalctl.)
linux-6.4.10.r105.gaf406bdbf3b1-1-x86_64.pkg.tar.zst is very bad and causes nvme read and write errors.

How do you build the kernels so fast? Which package do you use to build it?
Comment by loqs (loqs) - Friday, 08 September 2023, 13:53 GMT
$ git bisect bad
Bisecting: 50 revisions left to test after this (roughly 6 steps)
[965a20ed518ace1733655c45a49d8f4199e6d467] radix tree test suite: fix incorrect allocation size for pthreads

https://drive.google.com/file/d/1-NnCxpGgZgr7Y4am73xsfCcAd9ghJdck/view?usp=sharing linux-6.4.10.r54.g965a20ed518a-1-x86_64.pkg.tar.zst
https://drive.google.com/file/d/1AqQHO1bXGdlbA03hUavzH4rF_YSoP6US/view?usp=sharing linux-headers-6.4.10.r54.g965a20ed518a-1-x86_64.pkg.tar.zst

> How do you build the kernels so fast? Which package do you use to build it?
Parallel compilation [1] with 32 jobs (1 per core). Git bisection will also speed up as the number of changes in each step shrinks, but because I build each package in a clean chroot I do not benefit from that. The src.tar.gz is attached.
I use a separate checkout of the kernel tree to keep track of the bisection progress.

[1] https://wiki.archlinux.org/title/Makepkg#Parallel_compilation
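
A minimal sketch of the parallel compilation setting, assuming `nproc` from coreutils is available. The same MAKEFLAGS assignment can go into makepkg.conf; here it is only exported in the current shell and printed.

```shell
#!/bin/sh
# One make job per available CPU, falling back to a single job
# if nproc is not installed.
jobs=$(nproc 2>/dev/null || echo 1)
MAKEFLAGS="-j${jobs}"
export MAKEFLAGS
echo "MAKEFLAGS=${MAKEFLAGS}"
```

On the 32-core machine mentioned above this would yield -j32; on a typical laptop a kernel build will take correspondingly longer.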
Comment by Bernd Amend (ptb) - Friday, 08 September 2023, 14:05 GMT
965a20ed518ace1733655c45a49d8f4199e6d467 is good
Comment by loqs (loqs) - Friday, 08 September 2023, 14:28 GMT
$ git bisect good
Bisecting: 25 revisions left to test after this (roughly 5 steps)
[ce1ebdd6e63930d351e25f7b1cb4745d5ffcdf48] usb: typec: altmodes/displayport: Signal hpd when configuring pin assignment

https://drive.google.com/file/d/1mT7URzGsazJswAxkzUJE5JDmQFSjDHzg/view?usp=sharing linux-6.4.10.r79.gce1ebdd6e639-1-x86_64.pkg.tar.zst
https://drive.google.com/file/d/1--21CR_1tZnliUdHZZMNFnjNtlfV74rb/view?usp=sharing linux-headers-6.4.10.r79.gce1ebdd6e639-1-x86_64.pkg.tar.zst
Comment by Bernd Amend (ptb) - Friday, 08 September 2023, 15:17 GMT
Bad, but in a different way.
I can log in, but the system is unusable.
nvme nvme0: controller is down; will reset: CSTS=0xffffffff, PCI_STATUS=0xffff
nvme nvme0: Does your device have a faulty power saving mode enabled?
nvme nvme0: Try "nvme_core.default_ps_max_latency_us=0 pcie_aspm=off" and report a bug
nvme0n1: Read(0x2) @ LBA 1495304280, 192 blocks, Host Aborted Command (sct 0x3 / sc 0x71)
I/O error, dev nvme0n1, sector 1495304280 op 0x0:(READ) flags 0x80700 phys_seg 24 prio class 2
nvme0n1: Read(0x2) @ LBA 1718477648, 96 blocks, Host Aborted Command (sct 0x3 / sc 0x71)
I/O error, dev nvme0n1, sector 1718477648 op 0x0:(READ) flags 0x80700 phys_seg 2 prio clas
nvme 0000:04:00.0: Unable to change power state from ...
Comment by loqs (loqs) - Friday, 08 September 2023, 17:02 GMT
$ git bisect bad
Bisecting: 12 revisions left to test after this (roughly 4 steps)
[773cf6b3b64aa941456f3cfd3356f6ddf0d490f4] iio: cros_ec: Fix the allocation size for cros_ec_command

https://drive.google.com/file/d/1gncholD8EbuanWk9m7Sg7BN6_H6miVEe/view?usp=sharing linux-6.4.10.r66.g773cf6b3b64a-1-x86_64.pkg.tar.zst
https://drive.google.com/file/d/1XKfQ7VwxeM958z6B6w3TQPBEEFLhjXTG/view?usp=sharing linux-headers-6.4.10.r66.g773cf6b3b64a-1-x86_64.pkg.tar.zst

Does your system use the rtsx module? See FS#79439.
Comment by Bernd Amend (ptb) - Friday, 08 September 2023, 20:10 GMT
773cf6b3b64aa941456f3cfd3356f6ddf0d490f4 is good

I also have a Dell XPS 9560 :(.

My system uses the rtsx module
rtsx_pci_sdmmc 32768 0
mmc_core 262144 1 rtsx_pci_sdmmc
rtsx_pci 131072 1 rtsx_pci_sdmmc
Comment by loqs (loqs) - Friday, 08 September 2023, 20:15 GMT
Please try one of the bad kernels with the additional kernel parameters module_blacklist=rtsx_pci_sdmmc,rtsx_pci [1][2]

[1] https://wiki.archlinux.org/title/Kernel_parameters
[2] https://wiki.archlinux.org/title/Kernel_module#Using_kernel_command_line_2
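
A small sketch of how the parameter string is put together from a list of module names. blacklist_param is a hypothetical helper; the result still has to be appended to the kernel command line through the boot loader, as described in [1].

```shell
#!/bin/sh
# Join module names into a single module_blacklist= kernel parameter.
blacklist_param() {
    list=
    for mod in "$@"; do
        # Prefix a comma only when the list already has an entry.
        list=${list:+$list,}$mod
    done
    printf 'module_blacklist=%s\n' "$list"
}

blacklist_param rtsx_pci_sdmmc rtsx_pci
```

After booting with the parameter, `lsmod` should no longer list the rtsx modules.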
Comment by Bernd Amend (ptb) - Friday, 08 September 2023, 21:20 GMT
@loqs Thank you for your support. I think this bug can be marked as a duplicate of FS#79439.
I'm unsure how to proceed with the tpm issue.
Comment by loqs (loqs) - Friday, 08 September 2023, 21:38 GMT
Is the tpm issue still present in 6.5.2.arch1-1 with the rtsx modules blacklisted?
Comment by Bernd Amend (ptb) - Friday, 08 September 2023, 21:41 GMT
Sorry, I tested 6.4.12.arch1-1, and there the tpm issue was still reported.
The issue is gone with 6.5.2.arch1-1.
Comment by Toolybird (Toolybird) - Saturday, 09 September 2023, 01:08 GMT
IIUC, the tpm issue was a red herring and this was an rtsx problem all along? OK, closing.

PS: Awesome support again @loqs :)

> I use a separate checkout of the kernel tree to keep track of the bisection progress.

This tip is an absolute nugget! I do the same because it's the only sane way to perform bisections without screwing up your PKGBUILD. Should probably be documented somewhere...
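
For anyone who wants to try the idea, here is a self-contained sketch of a bisection kept in its own checkout. It uses a throwaway repository with ten made-up commits in place of the kernel tree, and a trivial `grep` in place of "build, boot and test", so everything about the repository contents is an assumption for illustration. Only standard `git bisect` subcommands are used.

```shell
#!/bin/sh
set -e
# Throwaway repository standing in for the separate kernel checkout.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email demo@example.invalid
git config user.name "Bisect Demo"
git config commit.gpgsign false

# Ten commits; the made-up regression lands in commit 7.
i=1
while [ "$i" -le 10 ]; do
    if [ "$i" -ge 7 ]; then echo bad > state; else echo good > state; fi
    echo "$i" > counter
    git add state counter
    git commit -q -m "commit $i"
    i=$((i + 1))
done

# The root commit is known good, HEAD is known bad.
good=$(git rev-list --max-parents=0 HEAD)
git bisect start HEAD "$good" >/dev/null
# Exit status 0 marks a commit good, 1 marks it bad.
git bisect run sh -c 'grep -q good state' >/dev/null

# The bisection state and its log live only in this checkout.
git bisect log > bisect.log
first_bad_subject=$(git log -1 --format=%s refs/bisect/bad)
echo "first bad: $first_bad_subject"
git bisect reset >/dev/null 2>&1
```

Because the state lives in the checkout, the packaging directory can be cleaned or rebuilt freely between steps, and the saved log can be fed to `git bisect replay` if the bisection has to be resumed elsewhere.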
