FS#79644 - [linux] Kernel 6.5.2 Causes Marvell Technology Group 88SE9128 PCIe SATA to Constantly Reset

Attached to Project: Arch Linux
Opened by patenteng (patenteng) - Monday, 11 September 2023, 01:42 GMT
Last edited by Toolybird (Toolybird) - Monday, 18 September 2023, 22:40 GMT
Task Type Bug Report
Category Kernel
Status Closed
Assigned To No-one
Architecture All
Severity Medium
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 0
Private No

Details

After upgrading to 6.5.2 I keep getting the following kernel messages around three times per second:

[ 9683.269830] ata16: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[ 9683.270399] ata16.00: configured for UDMA/66

So I've tracked the offending device:

ll /sys/class/ata_port/ata16
lrwxrwxrwx 1 root root 0 Sep 10 21:51 /sys/class/ata_port/ata16 -> ../../devices/pci0000:00/0000:00:1c.7/0000:0a:00.0/ata16/ata_port/ata16

cat /sys/bus/pci/devices/0000:0a:00.0/uevent
DRIVER=ahci
PCI_CLASS=10601
PCI_ID=1B4B:9130
PCI_SUBSYS_ID=1043:8438
PCI_SLOT_NAME=0000:0a:00.0
MODALIAS=pci:v00001B4Bd00009130sv00001043sd00008438bc01sc06i01

lspci | grep 0a:00.0
0a:00.0 SATA controller: Marvell Technology Group Ltd. 88SE9128 PCIe SATA 6 Gb/s RAID controller with HyperDuo (rev 11)
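
For anyone repeating the lookup above on a different port, the same mapping can be scripted. This is only a rough sketch: the function name pci_addr_from_ata_path is made up for illustration, and it assumes the sysfs link target has the same layout as the one shown for ata16.

```shell
#!/bin/sh
# Hypothetical helper (not part of any tool): given the sysfs link target of an
# ATA port, print the PCI address of the controller that owns the port.
pci_addr_from_ata_path() {
    # Split the path into components, keep those shaped like a PCI address
    # (dddd:bb:dd.f), and take the last one, i.e. the device nearest the port.
    printf '%s\n' "$1" \
        | tr '/' '\n' \
        | grep -E '^[0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7]$' \
        | tail -n 1
}

# Usage on a live system (ata16 is the port from the log above):
#   pci_addr_from_ata_path "$(readlink -f /sys/class/ata_port/ata16)"
```

The printed address can then be passed to lspci -s or used under /sys/bus/pci/devices/.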

I am not using the 88SE9128, so I have no way of knowing whether it works or not. It may simply be getting reset a couple of times per second or it may not function at all. The problem is not present in older kernels, such as the LTS kernel.

Closed by  Toolybird (Toolybird)
Monday, 18 September 2023, 22:40 GMT
Reason for closing:  Upstream
Additional comments about closing:  It needs bisection as requested by upstream. If not prepared to do this yourself, please consider the generous offer from @loqs via the forum.
Comment by Toolybird (Toolybird) - Monday, 11 September 2023, 02:44 GMT
> The problem is not present in older kernels

Therefore it's a kernel regression, and the standard debugging advice applies [1]. You'll likely have better luck reporting this upstream to the kernel folks. I couldn't find much related to this issue when searching online. Please let us know what you find out.

[1] https://wiki.archlinux.org/title/Kernel#Debugging_regressions
Comment by patenteng (patenteng) - Monday, 11 September 2023, 14:02 GMT
I've looked through the pacman logs. The issue started when I updated from kernel 6.4.12 to 6.5.2. I'll submit a bug to the kernel upstream to see what they say.

Currently I am unbinding the driver at every reboot with sudo sh -c 'echo "0000:0a:00.0" > /sys/bus/pci/drivers/ahci/unbind'. This stops the log spam.
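
A slightly more defensive version of that one-liner, as a sketch only: it checks that the device is actually bound to ahci before writing to unbind. The function name unbind_ahci and the SYSFS_ROOT variable are hypothetical; SYSFS_ROOT exists purely so the function can be tried against a scratch directory instead of the real /sys.

```shell
#!/bin/sh
# Hypothetical helper: unbind a PCI device from the ahci driver if it is bound.
# SYSFS_ROOT is an illustrative override for dry runs; it defaults to /sys.
unbind_ahci() {
    addr=$1
    drv_dir="${SYSFS_ROOT:-/sys}/bus/pci/drivers/ahci"
    # A bound device appears as an entry named after its PCI address.
    if [ -e "$drv_dir/$addr" ]; then
        printf '%s' "$addr" > "$drv_dir/unbind"
    else
        echo "$addr is not bound to ahci" >&2
        return 1
    fi
}

# Usage (needs root on a real system):
#   unbind_ahci 0000:0a:00.0
```

Like the original command, this does not persist across reboots, so it would still have to be run from whatever startup mechanism is preferred.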
Comment by patenteng (patenteng) - Monday, 11 September 2023, 14:15 GMT
I've submitted a bug report upstream: https://bugzilla.kernel.org/show_bug.cgi?id=217902.
Comment by loqs (loqs) - Tuesday, 12 September 2023, 21:59 GMT
If it helps with the bisection you have been asked to carry out, you can find the first bisection point in [1].
Please ask if you need further kernels to be built.

[1] https://bbs.archlinux.org/viewtopic.php?pid=2120389#p2120389
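
When booting each bisection kernel, it may help to have a quick check for whether the reset loop is present. The sketch below counts the "SATA link up" messages for one port in kernel log text read from stdin. The function name count_link_up is made up for illustration, and what count separates a good kernel from a bad one is left to the tester; a single link-up at boot is normal.

```shell
#!/bin/sh
# Hypothetical helper: count "SATA link up" lines for one ATA port.
# Reads kernel log text from stdin and prints the number of matching lines.
count_link_up() {
    # -F matches the string literally; -c prints the count.
    # grep exits non-zero when the count is 0, so mask that exit status.
    grep -c -F -- "$1: SATA link up" || true
}

# Usage after booting a candidate kernel and waiting a minute or so
# (dmesg may need root, depending on kernel.dmesg_restrict):
#   dmesg | count_link_up ata16
```

The port may not be numbered ata16 under every kernel build, so it is worth confirming the number via /sys/class/ata_port before trusting the count.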
Comment by loqs (loqs) - Wednesday, 13 September 2023, 15:54 GMT
Comment by loqs (loqs) - Friday, 15 September 2023, 17:01 GMT
