FS#7125 - kernel26suspend-2.6.21-1 Hangs on suspend2 in Atomic Copy

Attached to Project: Arch Linux
Opened by Michal Witkowski (Neuro) - Friday, 11 May 2007, 12:48 GMT
Last edited by Isenmann Daniel (ise) - Tuesday, 16 October 2007, 16:33 GMT
Task Type Bug Report
Category Kernel
Status Closed
Assigned To Isenmann Daniel (ise)
Architecture i686
Severity Medium
Priority Normal
Reported Version 0.8 Voodoo
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 1
Private No

Details

Hi,
Since beyond is no longer maintained, I decided to switch to suspend2 for my laptop (Asus A6JC). Suspending worked fine with kernel26beyond-2.6.19 and all previous ones. After switching to kernel26suspend2, suspending stopped working.

I tried numerous things: switching off userui, both PowerdownMethod values (4 and 5; 4 worked with beyond), and setting Reboot to yes, but the system still seems to hang.

When suspend2 worked in beyond, the HDD LED was constantly on during the atomic copy phase, writing to disk like mad. This time, everything shuts down (including the LCD), but the system is still running.

The logs show nothing:
maj 11 14:35:17.68 hibernate: [93] Executing DiskCacheDisable ...
maj 11 14:35:17.69 hibernate: Disabling disk cache on /dev/sda
maj 11 14:35:17.70 hibernate: [95] Executing XHacksSuspendHook2 ...
maj 11 14:35:18.09 xhacks: changing console from 2 to 15
maj 11 14:35:18.25 hibernate: [95] Executing XStatusProgress ...
maj 11 14:35:18.26 hibernate: [97] Executing ChangeToSwsuspVT ...
maj 11 14:35:18.27 hibernate: [98] Executing CheckRunlevel ...
maj 11 14:35:18.28 hibernate: [98] Executing FullSpeedCPUSuspend ...
maj 11 14:35:18.30 Switched to performance, with min freq at 1660000
maj 11 14:35:18.34 Switched to performance, with min freq at 1660000
maj 11 14:35:18.35 hibernate: [98] Executing Swsusp2ConfigSet ...
maj 11 14:35:18.37 hibernate: [98] Executing XStatusProgressKill ...
maj 11 14:35:18.38 hibernate: [99] Executing DoSwsusp2 ...
maj 11 14:35:18.39 hibernate: Activating suspend ...

messages.log:
May 11 14:33:18 deck ACPI: PCI interrupt for device 0000:03:00.0 disabled
May 11 14:33:18 deck ehci_hcd 0000:00:1d.7: remove, state 1
May 11 14:33:18 deck usb usb1: USB disconnect, address 1
May 11 14:33:18 deck usb 1-7: USB disconnect, address 4
May 11 14:33:18 deck ehci_hcd 0000:00:1d.7: USB bus 1 deregistered
May 11 14:33:18 deck ACPI: PCI interrupt for device 0000:00:1d.7 disabled
May 11 14:33:18 deck uhci_hcd 0000:00:1d.3: remove, state 1
May 11 14:33:18 deck usb usb5: USB disconnect, address 1
May 11 14:33:18 deck uhci_hcd 0000:00:1d.3: USB bus 5 deregistered
May 11 14:33:18 deck ACPI: PCI interrupt for device 0000:00:1d.3 disabled
May 11 14:33:18 deck uhci_hcd 0000:00:1d.2: remove, state 1
May 11 14:33:18 deck usb usb4: USB disconnect, address 1
May 11 14:33:18 deck usb 4-2: USB disconnect, address 2
May 11 14:33:18 deck uhci_hcd 0000:00:1d.2: USB bus 4 deregistered
May 11 14:33:18 deck ACPI: PCI interrupt for device 0000:00:1d.2 disabled
May 11 14:33:18 deck uhci_hcd 0000:00:1d.1: remove, state 1
May 11 14:33:18 deck usb usb3: USB disconnect, address 1
May 11 14:33:18 deck usb 3-1: USB disconnect, address 2
May 11 14:33:18 deck uhci_hcd 0000:00:1d.1: USB bus 3 deregistered
May 11 14:33:18 deck ACPI: PCI interrupt for device 0000:00:1d.1 disabled
May 11 14:33:18 deck uhci_hcd 0000:00:1d.0: remove, state 1
May 11 14:33:18 deck usb usb2: USB disconnect, address 1
May 11 14:33:18 deck uhci_hcd 0000:00:1d.0: USB bus 2 deregistered
May 11 14:33:18 deck ACPI: PCI interrupt for device 0000:00:1d.0 disabled
May 11 14:33:18 deck ata1.00: configured for UDMA/100
May 11 14:33:19 deck ata1.01: configured for UDMA/33
May 11 14:33:19 deck ata1: EH complete
May 11 14:33:19 deck SCSI device sda: 156301488 512-byte hdwr sectors (80026 MB)
May 11 14:33:19 deck sda: Write Protect is off
May 11 14:33:19 deck SCSI device sda: write cache: disabled, read cache: enabled, doesn't support DPO or FUA
May 11 14:33:19 deck SCSI device sda: 156301488 512-byte hdwr sectors (80026 MB)
May 11 14:33:19 deck sda: Write Protect is off
May 11 14:33:19 deck SCSI device sda: write cache: disabled, read cache: enabled, doesn't support DPO or FUA
May 11 14:33:19 deck Suspend2: Initiating a software suspend cycle.
May 11 14:33:19 deck SMP alternatives: switching to UP code

Any ideas?

Closed by  Isenmann Daniel (ise)
Tuesday, 16 October 2007, 16:33 GMT
Reason for closing:  Won't fix
Additional comments about closing:  kernel26suspend is no longer maintained in the official repository.
Comment by Michal Witkowski (Neuro) - Friday, 11 May 2007, 12:49 GMT
Forgot to add: I also tried noapic and nolapic, but neither of these helped. With beyond 2.6.19 it worked fine without these parameters.
Comment by Thomas Bächler (brain0) - Sunday, 13 May 2007, 02:22 GMT
The problem here is that your logs only show what happened up to the point where you started suspending. After that, the syslog process is frozen, and the logs are only written to the hard drive once a cycle has completed.

I need as much information as possible on your hardware and the drivers you use, the more the better (ANY information may help, even if it seems unimportant at first). All I can do is discuss the issue with nigel in #suspend2 on freenode and hope to find a solution. Suspend2 has changed a lot between 2.2.9 (on 2.6.19) and now. Maybe 2.6.21-2 fixes your bug. If it doesn't, any information you have will probably be helpful to nigel. You can talk to him directly on #suspend2, or report here and I will pass it on to him.

That said, I am confident that with nigel's help, we can fix the issue - he has always been most helpful to me.
Comment by Michal Witkowski (Neuro) - Thursday, 17 May 2007, 21:44 GMT
Sorry for the delay, I couldn't afford to mess with the kernel during the week.
Now, I tried 2.6.21-2, and it doesn't fix the bug at all. The problem seems to be really low-level. With 2.6.19-beyond, the procedure went something like:
(power led is on)
Doing atomic copy.
(screen blanks)
(hdd led lights up and the drive writes like mad)
(the hdd stops writing, short delay)
(the power led goes off with a bit of a speaker bang [like on power off])
Now the procedure halts after the screen blanks; the HDD never starts writing.

I'm no expert, but nothing like this ever came up in dmesg before:
scsi0 : ata_piix
Device driver host0 lacks bus and class support for being resumed.
ata1.00: ATA-6: HTS541080G9AT00, MB4OA60A, max UDMA/100
ata1.00: 156301488 sectors, multi 16: LBA48
ata1.01: ATAPI, max UDMA/33
ata1.00: configured for UDMA/100
ata1.01: configured for UDMA/33
scsi1 : ata_piix
Device driver host1 lacks bus and class support for being resumed.
ata2: port disabled. ignoring.
Device driver target0:0:0 lacks bus and class support for being resumed.
scsi 0:0:0:0: Direct-Access ATA HTS541080G9AT00 MB4O PQ: 0 ANSI: 5
Device driver target0:0:1 lacks bus and class support for being resumed.
scsi 0:0:1:0: CD-ROM HL-DT-ST DVDRAM GMA-4082N HJ02 PQ: 0 ANSI: 5

These "lacks bus and class support for being resumed" messages were not present in the 2.6.19 dmesg. I attach the whole dmesg and lspci output, as you may find them useful. If any other information is needed, please let me know and I'll provide it ASAP.
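
For anyone comparing dmesg output between two kernels in the same way, here is a minimal Python sketch that pulls out the devices flagged by this message. The message text is copied from the excerpt above; the function names are hypothetical, and the sample uses only lines quoted in this report.

```python
import re

# Message text copied from the dmesg excerpt in this report.
PATTERN = re.compile(
    r"Device driver (\S+) lacks bus and class support for being resumed"
)


def devices_lacking_resume_support(dmesg_text):
    """Return the device names from matching lines, in order of appearance."""
    return PATTERN.findall(dmesg_text)


def new_warnings(old_dmesg, new_dmesg):
    """Return devices flagged in new_dmesg that were not flagged in old_dmesg."""
    old = set(devices_lacking_resume_support(old_dmesg))
    return [d for d in devices_lacking_resume_support(new_dmesg) if d not in old]


# Sample input: lines quoted from the 2.6.21 dmesg above.
sample = """scsi0 : ata_piix
Device driver host0 lacks bus and class support for being resumed.
ata1.00: configured for UDMA/100
scsi1 : ata_piix
Device driver host1 lacks bus and class support for being resumed.
Device driver target0:0:0 lacks bus and class support for being resumed.
"""

print(devices_lacking_resume_support(sample))
```

Passing the saved 2.6.19 dmesg as `old_dmesg` and the 2.6.21 one as `new_dmesg` to `new_warnings` would list only the devices that started reporting the message on the newer kernel.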
Comment by Thomas Bächler (brain0) - Friday, 18 May 2007, 16:09 GMT
I can't read it now, will do so later. Anyway, I use ata_piix on my laptop (though only for IDE drives) and resuming is fine.
Comment by Michal Witkowski (Neuro) - Friday, 18 May 2007, 18:25 GMT
Found the fix for the problem here:
http://bbs.archlinux.org/viewtopic.php?pid=250074
and here:
http://lkml.org/lkml/2007/5/16/31

Appending highres=off nohz=off to the kernel command line solves the problem. The kernel (2.6.21-2) now suspends and resumes correctly. However, neither parameter does the job on its own.

I know this is only a temporary fix, as nohz is an important feature, especially for laptop users like me ;) So, if any information is needed, please say so, I'll be happy to help.
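
Since both parameters are needed together, it can be worth confirming after a reboot that the running kernel actually received them. The following is a minimal Python sketch under that assumption; the function names are hypothetical, and the example command line is made up for illustration. On a running Linux system the real command line can be read from /proc/cmdline.

```python
# The two parameters named in the workaround above.
REQUIRED = ("highres=off", "nohz=off")


def missing_workaround_params(cmdline, required=REQUIRED):
    """Return the required parameters that are absent from a kernel command line."""
    present = set(cmdline.split())
    return [p for p in required if p not in present]


def read_cmdline(path="/proc/cmdline"):
    """Read the command line the running kernel was booted with (Linux only)."""
    with open(path) as f:
        return f.read().strip()


# Hypothetical command line for illustration; use read_cmdline() on a real system.
example = "root=/dev/sda1 ro nohz=off"
missing = missing_workaround_params(example)
if missing:
    print("missing:", " ".join(missing))
else:
    print("both workaround parameters are present")
```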
Comment by Thomas Bächler (brain0) - Sunday, 20 May 2007, 12:14 GMT
There are some known problems with suspend2 and nohz/highres-timers. I'll see what I can find out. For now, just keep the workaround.

I'll keep this bugreport open until we know more.
Comment by Michal Witkowski (Neuro) - Saturday, 15 September 2007, 06:55 GMT
The same problem still exists in kernel26suspend2-2.6.22.6-1. My custom-built kernel also suffers from it when highres and nohz are turned on.

Any progress on fixing this?
Comment by Isenmann Daniel (ise) - Saturday, 15 September 2007, 12:45 GMT
No, I haven't found anything yet. I don't have the problem with the suspend2 kernel; I use it daily on my machine.
But I will see what I can find out.
