FS#23725 - [kernel26] Since update to 2.6.38-2 udevs ata_id slows the system to a crawl
Attached to Project:
Arch Linux
Opened by heiko (heiko) - Tuesday, 12 April 2011, 21:19 GMT
Last edited by Andrea Scarpino (BaSh) - Saturday, 30 April 2011, 11:33 GMT
Details
Description:
Since the update of kernel26 from 2.6.37-5 to 2.6.38-2 my PC (x86_64) is incredibly slow - bootup time has multiplied by a factor of ten to twenty, maybe even more. The slowdown begins during the udev module loading stage of my initramfs, when my CD-writer also starts making a constant ticking sound and no longer opens its tray. The kernel spits out the following error messages about five or six times per second, and when the system eventually boots up it is still pretty sluggish:
scsi2: Issued Channel A Bus Reset. 1 SCBs aborted
(scsi2:A:0:0): No or incomplete CDB sent to device.
The probable cause is udev's /lib/udev/ata_id, called by the rule tagged "ATA/ATAPI devices using the "scsi" subsystem" in /lib/udev/rules.d/60-persistent-storage.rules for my pure-SCSI CD-writer Plextor PX-W124TS on my Adaptec 2940-something SCSI card using the aic7xxx driver.
I'm totally unsure whether this is a kernel bug (should abort somehow), a udev bug (should not call ata_id on a SCSI device) or a packaging bug (causing udev to call ata_id on a SCSI device in the first place), so I'm reporting it here first...
Downgrading kernel26 back to 2.6.37-5 magically solves the problem.
I'm not sure whether the same happens with i686, so I set the architecture field to x86_64 for now.
I am willing to test new kernel versions once they are out, patched versions of udev, anything. It might take a day or two though, as time can be short. I might start some testing myself tomorrow, such as disabling that particular udev rule for a start.
Regards, Heiko
Steps to reproduce:
1) Run a 2.6.38-2 kernel with initramfs on a system containing an Adaptec 2940 with a Plextor PX-W124TS attached.*
2) Watch it spit out error messages.
*My guess is that this is going to happen as well if you don't use an initramfs, only later during the "regular" bootup stage. I can't test this because my root filesystem is encrypted.
It probably also happens on other SCSI cards, maybe even with other CD writers - I can't test that either, since I have neither.
Closed by Andrea Scarpino (BaSh)
Saturday, 30 April 2011, 11:33 GMT
Reason for closing: Fixed
Additional comments about closing: kernel26 2.6.38.3-1
Hardware - Intel D865GBF Mobo/Adaptec AHA2940AU/Plextor PX-40TSi.
We have the same SCSI config. Machine won't boot at all - just gets stuck in an endless repetition of:
scsi2: Issued Channel A Bus Reset. 1 SCBs aborted
(scsi2:A:0:0): No or incomplete CDB sent to device.
Previous kernel and Arch core ISO both will boot and work fine.
EDIT:
Swapped out AHA2940 for a Mylex Flashpoint LT (BT-930R) and all works well, suggesting the problem does lie with the aic7xxx driver in the 2.6.38 kernel.
EDIT 2:
Kernel 2.6.38.3 released today, one week after bug report. Problem has disappeared (for me). Excellent. Thanks Tobias/Thomas.
# uname -a
Linux smokey 2.6.38-ARCH #1 SMP PREEMPT Sun Apr 17 14:51:34 UTC 2011 i686 Pentium III (Coppermine) GenuineIntel GNU/Linux
# pacman -Q kernel26
kernel26 2.6.38.3-1
The CD drive in question:
# lsscsi --verbose
[0:0:5:0] cd/dvd NEC CD-ROM DRIVE:466 1.06 /dev/sr0
dir: /sys/bus/scsi/devices/0:0:5:0 [/sys/devices/pci0000:00/0000:00:02.0/0000:01:06.0/host0/target0:0:5/0:0:5:0]
...
It's on an Adaptec aic7880 ultrascsi card, using driver aic7xxx.
From dmesg:
[ 19.297170] (scsi0:A:5:0): No or incomplete CDB sent to device.
[ 19.300480] scsi0: Issued Channel A Bus Reset. 1 SCBs aborted
(over and over again)
This continues until I hit alt-sysrq-E to terminate all tasks, and only then will my boot continue.
I can give more logs if needed.
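For anyone who wants to put a number on "over and over again", here is a minimal Python sketch that estimates how many bus resets per second each SCSI host logs. It assumes dmesg lines in the timestamped format quoted above; the function name reset_rate is made up for illustration and is not part of any existing tool.

```python
import re

# Matches timestamped kernel log lines such as
# "[ 19.300480] scsi0: Issued Channel A Bus Reset. 1 SCBs aborted"
# (format assumed from the dmesg excerpt quoted in this report).
RESET_RE = re.compile(
    r"^\[\s*(?P<ts>\d+\.\d+)\]\s+(?P<host>scsi\d+): Issued Channel [A-Z] Bus Reset"
)


def reset_rate(lines):
    """Return {host: bus resets per second} for the given dmesg lines."""
    stamps = {}
    for line in lines:
        match = RESET_RE.match(line.strip())
        if match:
            stamps.setdefault(match.group("host"), []).append(float(match.group("ts")))

    rates = {}
    for host, ts in stamps.items():
        span = max(ts) - min(ts)
        # A rate needs at least two resets with distinct timestamps.
        rates[host] = (len(ts) - 1) / span if span > 0 else 0.0
    return rates
```

Passing it the lines of a saved dmesg dump (or sys.stdin when piping dmesg into a small wrapper script) gives one rate per host, which makes it easy to compare a broken kernel against a working one.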
Jelly,
This is the most intimate I've ever been with a kernel bug.
From my viewpoint, a fairly serious breakage in the first Arch 2.6.38 kernel was fixed with the subsequent 2.6.38.3 release.
I could find no mention of this aic7xxx problem on LKML, and the checksums of all the aic7xxx source files are identical in both releases.
Apart from that I wouldn't know how else to investigate the issue. Any advice would be appreciated.
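In case someone wants to repeat the checksum comparison, a rough Python sketch of one way to do it follows. It hashes every file under two directory trees and lists the files whose contents differ or that exist on only one side. The function names and the directory names in the usage note are placeholders, not taken from this report.

```python
import hashlib
from pathlib import Path


def tree_digests(root):
    """Map each file path (relative to root) to its SHA-256 hex digest."""
    root = Path(root)
    return {
        str(path.relative_to(root)): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def differing_files(old_root, new_root):
    """Return sorted relative paths that differ or exist in only one tree."""
    old = tree_digests(old_root)
    new = tree_digests(new_root)
    return sorted(
        name for name in old.keys() | new.keys() if old.get(name) != new.get(name)
    )
```

With both kernel sources unpacked, calling differing_files("linux-2.6.38/drivers/scsi/aic7xxx", "linux-2.6.38.3/drivers/scsi/aic7xxx") (hypothetical directory names) should return an empty list if the driver sources really are identical. An empty result only rules out changes in that directory; a fix elsewhere in the tree could still be what made the problem go away.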
ide: unexport DISK_EVENT_MEDIA_CHANGE for ide-gd and ide-cd
check_events() implementations in both ide-gd and ide-cd are
inadequate for in-kernel event polling. Both generate media change
events continuously when certain conditions are met causing infinite
event loop between the driver and userland event handler.
As disk event now supports suppression of unlisted events, simply
de-listing DISK_EVENT_MEDIA_CHANGE from disk->events resolves the
problem. Internal handling around media revalidation will behave the
same while userland will fall back to userland event polling after
detecting the device doesn't support disk events.
Jelle: No, I did not report this upstream. Frankly, I was hoping for someone else to do so and thereby spare me the trouble of registering with yet another bugzilla. (On a side note: This "subscribe to this mailing list, register for that bug tracker, join yet another Yahoo Group" is not helping in getting bugs reported to the correct places. I have no better plan available, just don't like the way it is...)
I was about to try now when I noticed that bugzilla.kernel.org is down, and it looks like the bug disappeared in 2.6.38.3 anyway.
Tom: I don't think this is related, because my system is not using any ide-cd drivers (it runs on pure SCSI). I can't rule it out, though.
Regards, Heiko