FS#40454 - [linux] 3.13.x - 3.15.x mptscsi taking longer than 30 seconds to probe resulting in boot fail
Attached to Project: Arch Linux
Opened by Jason Begley (jayray) - Monday, 19 May 2014, 18:00 GMT
Last edited by Dave Reisner (falconindy) - Saturday, 13 September 2014, 14:14 GMT
Details
Description:
After upgrading to kernel 3.13 or later, the kernel oopses at about the 30-second mark of boot because of a timeout while waiting on MPTSCSI, and the boot fails.
Additional info:
* Any kernel 3.13+
* Hardware: Dell 1950 PERC6
Steps to reproduce:
Hardware specific. Matching hardware and kernel version reproduces the findings.
The issue is detailed at https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1276705
Closed by Dave Reisner (falconindy)
Saturday, 13 September 2014, 14:14 GMT
Reason for closing: Upstream
Additional comments about closing: Nothing for Arch to do here. See upstream discussion about udev timeouts and kernel regressions:
http://lists.freedesktop.org/archives/systemd-devel/2014-September/022923.html
Additional info: This seems to be an issue with both Kernel 3.13+ and systemd.
Excerpts from https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1297248
(1) Currently finit_module() of the mptsas kernel module needs more than
30 seconds to initialize the LSI SAS1068E disk.
(2) Currently systemd-udevd unconditionally sends SIGKILL upon hardcoded
30 seconds timeout. As a result, finit_module() of mptsas kernel
module receives SIGKILL when waiting for error handler thread to be
started.
(3) Before commit 786235ee was applied, finit_module() receiving SIGKILL
was no problem because kthread_create() ignored SIGKILL when waiting
for error handler thread to be started. But after commit 786235ee was
applied, finit_module() receiving SIGKILL is a problem because
kthread_create() no longer ignores SIGKILL when waiting for error
handler thread to be started. As a result, finit_module() of mptsas
kernel module failed to initialize LSI SAS1068E disk, leading to
a boot failure.
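The interaction in points (1) to (3) can be illustrated with a small user-space analogy. This is only a sketch, not udev or kernel code: the function name probe_with_timeout and the durations are hypothetical, a sleeping child process stands in for the slow finit_module() call, and the parent plays the role of a supervisor that sends SIGKILL once a fixed timeout expires.

```python
import subprocess
import sys


def probe_with_timeout(work_seconds, timeout_seconds):
    """Run a stand-in 'probe' in a child process; kill it if it overruns.

    Hypothetical helper for illustration only. The child just sleeps for
    work_seconds, standing in for a slow driver initialization.
    """
    child = subprocess.Popen(
        [sys.executable, "-c", f"import time; time.sleep({work_seconds})"]
    )
    try:
        child.wait(timeout=timeout_seconds)
        return "finished"
    except subprocess.TimeoutExpired:
        # The supervisor gives up unconditionally, as in point (2).
        child.kill()  # sends SIGKILL on POSIX systems
        child.wait()  # reap the killed child
        return "killed"


if __name__ == "__main__":
    # Scaled-down numbers: a 3 s "probe" against a 0.5 s timeout.
    print(probe_with_timeout(3, 0.5))
    # A probe that completes well inside the timeout.
    print(probe_with_timeout(0, 5))
```

With a probe that outlasts the timeout, the work is cut off regardless of how close it was to completing, which is the failure mode described for mptsas.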
Commit 786235ee was meant for helping OOM killer to terminate the victim
process immediately when the victim process is unable to be terminated
due to waiting for kthreadd process to complete memory allocation.
Kernel developers think that it is a systemd bug because any thread
that receives SIGKILL has a right to terminate immediately. Therefore,
reverting commit 786235ee is not acceptable to kernel developers.
On the other hand, systemd developers think that it is a kernel bug
because finit_module() should return within 30 seconds. Therefore,
changing to a longer timeout is not acceptable to systemd developers.
Since there was no time to wait for systemd to allow a longer timeout,
Bug #1276705 used a SAUCE patch that allows kthread_create() to ignore
SIGKILL up to 10 seconds. We used a SAUCE patch for Ubuntu 14.04, but
we don't want to carry this SAUCE patch forever.
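The three behaviours described in the excerpt (ignore SIGKILL, honour it immediately, ignore it for up to 10 seconds) can be compared with a toy model. This is a sketch under stated assumptions and not the kernel or SAUCE patch code: the function name, the policy labels, and the remaining_wait parameter are all hypothetical, and it only models whether thread creation completes once SIGKILL has arrived while the caller is still waiting.

```python
def kthread_create_outcome(policy, remaining_wait, grace=10.0):
    """Toy model of waiting for a kernel thread after SIGKILL arrives.

    policy          -- hypothetical label: "ignore" (before commit 786235ee),
                       "abort" (after commit 786235ee), or "grace" (SAUCE patch)
    remaining_wait  -- seconds the caller would still have to wait (> 0)
    grace           -- how long SIGKILL is ignored under the "grace" policy
    """
    if policy == "ignore":
        # SIGKILL is ignored while waiting, so creation always completes.
        return "created"
    if policy == "abort":
        # SIGKILL is honoured immediately; the wait is abandoned.
        return "aborted"
    if policy == "grace":
        # SIGKILL is ignored only for a bounded period.
        return "created" if remaining_wait <= grace else "aborted"
    raise ValueError(f"unknown policy: {policy}")


if __name__ == "__main__":
    for policy in ("ignore", "abort", "grace"):
        print(policy, kthread_create_outcome(policy, remaining_wait=2.0))
```

In this model the bounded grace period lets a slow probe complete as long as the thread starts within the window after the signal, while still letting a killed process terminate if the wait runs longer.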
PKGBUILD (15.5 KiB)
Why isn't this patch upstream?