FS#9698 - Random Freezes with recent [testing] packages

Attached to Project: Arch Linux
Opened by Nick Roberts (RobbeR49) - Wednesday, 27 February 2008, 01:22 GMT
Last edited by Aaron Griffin (phrakture) - Friday, 25 April 2008, 15:00 GMT
Task Type Bug Report
Category Packages: Testing
Status Closed
Assigned To Travis Willard (Cerebral)
Architecture x86_64
Severity High
Priority Normal
Reported Version 2007.08-2
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 0
Private No

Details

Description:

I began getting random freezes a few days ago after updating some packages from [testing]. They've been occurring about 2-3 times a day with no discernible pattern to when they happen. No user input or escaping is possible after the freeze so it requires a hard reset.


Additional info:
* package version(s)
Here are most of the major updates I did:
coreutils (6.10-1 -> 6.10-2)
dhcpcd (3.2.0-1 -> 3.2.1-1)
perl (5.8.8-9 -> 5.10.0-2)
libevent (1.3d-2 -> 1.3e-1)
openbox (3.4.6.1-1 -> 3.4.6.1-2)
xorg-server (1.4.0.90-5 -> 1.4.0.90-7)
catalyst & catalyst-utils (8.01-1 -> 8.02-1)
kernel26 (2.6.23.14-1 -> 2.6.24.2-1)

* config and/or log files etc.

Here's my kernel.log from where the trouble seems to start. The errors seem to be just about the same every time: a bunch of general protection faults, with the fglrx errors at or near the end (full log attached).

Feb 22 03:39:40 home general protection fault: 0000 [1] PREEMPT SMP
Feb 22 03:39:40 home CPU 0
Feb 22 03:39:40 home Modules linked in: usb_storage ipv6 fglrx(P) nls_cp437 vfat fat reiserfs ppdev lp ohci1394 ieee1394 firewire_ohci firewire_core crc_itu_t emu10k1_gp gameport parport_pc parport rtc_cmos rtc_core rtc_lib ppp_generic slhc pcspkr k8temp shpchp pci_hotplug forcedeth usbhid hid ff_memless i2c_nforce2 i2c_core sg evdev snd_seq_oss snd_seq_midi_event snd_seq thermal processor fan button snd_pcm_oss snd_mixer_oss battery ac snd_emu10k1 snd_rawmidi snd_ac97_codec ac97_bus snd_pcm snd_seq_device snd_timer snd_page_alloc snd_util_mem snd_hwdep snd soundcore ext3 jbd mbcache sd_mod ehci_hcd sr_mod ohci_hcd cdrom pata_acpi usbcore sata_nv ata_generic pata_amd libata
Feb 22 03:39:40 home Pid: 6075, comm: hald-addon-stor Tainted: P 2.6.24-ARCH #1
Feb 22 03:39:40 home RIP: 0010:[<ffffffff80297f43>] [<ffffffff80297f43>] kmem_cache_alloc+0x43/0xa0
Feb 22 03:39:40 home RSP: 0018:ffff81001f057aa8 EFLAGS: 00010086
Feb 22 03:39:40 home RAX: 0000000000000000 RBX: ff7f810000dc49c0 RCX: ffffffff803f72d7
Feb 22 03:39:40 home RDX: 0000000000000003 RSI: 0000000000008010 RDI: ffffffff805e0760
Feb 22 03:39:40 home RBP: 0000000000000292 R08: 0000000000000000 R09: ffff81001f057b48
Feb 22 03:39:40 home R10: 2222222222222222 R11: 2222222222222222 R12: ffff810001707560
Feb 22 03:39:40 home R13: 0000000000008010 R14: 0000000000000000 R15: 0000000000000003
Feb 22 03:39:40 home FS: 00002affc3193110(0000) GS:ffffffff805d4000(0000) knlGS:00000000f7e526c0
Feb 22 03:39:40 home CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Feb 22 03:39:40 home CR2: 00007fffee42bef0 CR3: 000000001dc80000 CR4: 00000000000006e0
Feb 22 03:39:40 home DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Feb 22 03:39:40 home DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Feb 22 03:39:40 home Process hald-addon-stor (pid: 6075, threadinfo ffff81001f056000, task ffff81001ee04000)
Feb 22 03:39:40 home Stack: 0000000100303def 0000000004000000 ffff81001ed02540 ffff81001f057b48
Feb 22 03:39:40 home 0000000000000000 ffffffff803f72d7 ffffffff80633a00 0000000000000257
Feb 22 03:39:40 home 0000000000000202 ffff81001f057b58 ffff81001ec59000 ffff81001ec59000
Feb 22 03:39:40 home Call Trace:
Feb 22 03:39:40 home [<ffffffff803f72d7>] scsi_execute_req+0x57/0x100
Feb 22 03:39:40 home [<ffffffff803f73c9>] scsi_test_unit_ready+0x49/0xa0
Feb 22 03:39:40 home [<ffffffff802aaa31>] free_poll_entry+0x11/0x20
Feb 22 03:39:40 home [<ffffffff802aaa70>] poll_freewait+0x30/0x90
Feb 22 03:39:40 home [<ffffffff88076173>] :sr_mod:sr_media_change+0x73/0x270
Feb 22 03:39:40 home [<ffffffff802aadb8>] do_sys_poll+0x2e8/0x380
Feb 22 03:39:40 home [<ffffffff880620a8>] :cdrom:media_changed+0x68/0xa0
Feb 22 03:39:40 home [<ffffffff802c6df7>] check_disk_change+0x27/0x90
Feb 22 03:39:40 home [<ffffffff88066712>] :cdrom:cdrom_open+0x182/0xb50
Feb 22 03:39:40 home [<ffffffff802af401>] dput+0x21/0x140
Feb 22 03:39:40 home [<ffffffff802a71a6>] __link_path_walk+0xb16/0xeb0
Feb 22 03:39:40 home [<ffffffff802b4e2f>] mntput_no_expire+0x1f/0x90
Feb 22 03:39:40 home [<ffffffff802a75c1>] link_path_walk+0x81/0xf0
Feb 22 03:39:40 home [<ffffffff804bce6e>] __mutex_lock_slowpath+0x15e/0x2c0
Feb 22 03:39:40 home [<ffffffff804bce6e>] __mutex_lock_slowpath+0x15e/0x2c0
Feb 22 03:39:40 home [<ffffffff804bce6e>] __mutex_lock_slowpath+0x15e/0x2c0
Feb 22 03:39:40 home [<ffffffff880764d3>] :sr_mod:sr_block_open+0xb3/0xd0
Feb 22 03:39:40 home [<ffffffff802c773d>] do_open+0xad/0x330
Feb 22 03:39:40 home [<ffffffff802a5f93>] may_open+0xc3/0x270
Feb 22 03:39:40 home [<ffffffff802c7c30>] blkdev_open+0x0/0x90
Feb 22 03:39:40 home [<ffffffff802c7c6c>] blkdev_open+0x3c/0x90
Feb 22 03:39:40 home [<ffffffff8029a4eb>] __dentry_open+0x12b/0x270
Feb 22 03:39:40 home [<ffffffff8029a73a>] do_filp_open+0x3a/0x50
Feb 22 03:39:40 home [<ffffffff802b18c2>] iput+0x42/0x80
Feb 22 03:39:40 home [<ffffffff8029a389>] get_unused_fd_flags+0x109/0x130
Feb 22 03:39:40 home [<ffffffff8029a7aa>] do_sys_open+0x5a/0xf0
Feb 22 03:39:40 home [<ffffffff8020c49e>] system_call+0x7e/0x83
Feb 22 03:39:40 home
Feb 22 03:39:40 home
Feb 22 03:39:40 home Code: 48 8b 04 c3 49 89 04 24 55 9d 66 45 85 ed 79 14 48 85 db 74
Feb 22 03:39:40 home RIP [<ffffffff80297f43>] kmem_cache_alloc+0x43/0xa0
Feb 22 03:39:40 home RSP <ffff81001f057aa8>
Feb 22 03:39:40 home ---[ end trace 3c27c650313471eb ]---
Feb 22 03:39:55 home general protection fault: 0000 [2] PREEMPT SMP
Feb 22 03:39:55 home CPU 0
Feb 22 03:39:55 home Modules linked in: usb_storage ipv6 fglrx(P) nls_cp437 vfat fat reiserfs ppdev lp ohci1394 ieee1394 firewire_ohci firewire_core crc_itu_t emu10k1_gp gameport parport_pc parport rtc_cmos rtc_core rtc_lib ppp_generic slhc pcspkr k8temp shpchp pci_hotplug forcedeth usbhid hid ff_memless i2c_nforce2 i2c_core sg evdev snd_seq_oss snd_seq_midi_event snd_seq thermal processor fan button snd_pcm_oss snd_mixer_oss battery ac snd_emu10k1 snd_rawmidi snd_ac97_codec ac97_bus snd_pcm snd_seq_device snd_timer snd_page_alloc snd_util_mem snd_hwdep snd soundcore ext3 jbd mbcache sd_mod ehci_hcd sr_mod ohci_hcd cdrom pata_acpi usbcore sata_nv ata_generic pata_amd libata
Feb 22 03:39:55 home Pid: 6184, comm: firefox-bin Tainted: P D 2.6.24-ARCH #1
Feb 22 03:39:55 home RIP: 0010:[<ffffffff80297f43>] [<ffffffff80297f43>] kmem_cache_alloc+0x43/0xa0
Feb 22 03:39:55 home RSP: 0018:ffff8100187bdb28 EFLAGS: 00010086
Feb 22 03:39:55 home RAX: 0000000000000000 RBX: ff7f810000dc49c0 RCX: ffffffff804411fe
Feb 22 03:39:55 home RDX: ffff81001dc65600 RSI: 00000000000080d0 RDI: ffffffff805e0760
Feb 22 03:39:55 home RBP: 0000000000000296 R08: 0000000000000000 R09: 2222222222222222
Feb 22 03:39:55 home R10: 2222222222222222 R11: 2222222222222222 R12: ffff810001707560
Feb 22 03:39:55 home R13: 00000000000080d0 R14: ffff81001dc65600 R15: ffff81001f817400
Feb 22 03:39:55 home FS: 0000000040800950(0063) GS:ffffffff805d4000(0000) knlGS:00000000f7e526c0
Feb 22 03:39:55 home CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Feb 22 03:39:55 home CR2: 00007fffee42bef0 CR3: 0000000018787000 CR4: 00000000000006e0
Feb 22 03:39:55 home DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Feb 22 03:39:55 home DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Feb 22 03:39:55 home Process firefox-bin (pid: 6184, threadinfo ffff8100187bc000, task ffff810018744000)
Feb 22 03:39:55 home Stack: ffff8100187bdbd8 ffff81001edba500 ffffffff8042fc60 ffff81001edba500
Feb 22 03:39:55 home 0000000000000000 ffffffff804411fe ffffffff805cb120 ffffffff8042fc60
Feb 22 03:39:55 home ffff81001dc65600 ffff81001edba500 0000000000000000 0000000000000006
Feb 22 03:39:55 home Call Trace:
Feb 22 03:39:55 home [<ffffffff8042fc60>] rtnl_dump_all+0x0/0xe0
Feb 22 03:39:55 home [<ffffffff804411fe>] netlink_dump_start+0x2e/0x180
Feb 22 03:39:55 home [<ffffffff8042fc60>] rtnl_dump_all+0x0/0xe0
Feb 22 03:39:55 home [<ffffffff80431606>] rtnetlink_rcv_msg+0xe6/0x240
Feb 22 03:39:55 home [<ffffffff80431520>] rtnetlink_rcv_msg+0x0/0x240
Feb 22 03:39:55 home [<ffffffff8043fe54>] netlink_rcv_skb+0x74/0xa0
Feb 22 03:39:55 home [<ffffffff80431518>] rtnetlink_rcv+0x18/0x20
Feb 22 03:39:55 home [<ffffffff8043fbb3>] netlink_unicast+0x243/0x270
Feb 22 03:39:55 home [<ffffffff804404c6>] netlink_sendmsg+0x246/0x310
Feb 22 03:39:55 home [<ffffffff80418f9b>] sock_sendmsg+0xcb/0x100
Feb 22 03:39:55 home [<ffffffff80252ae0>] autoremove_wake_function+0x0/0x30
Feb 22 03:39:55 home [<ffffffff8043f211>] netlink_insert+0xa1/0x160
Feb 22 03:39:55 home [<ffffffff80417f0e>] move_addr_to_kernel+0x2e/0x40
Feb 22 03:39:55 home [<ffffffff80419476>] sys_sendto+0x146/0x1b0
Feb 22 03:39:55 home [<ffffffff80418681>] sockfd_lookup_light+0x41/0x80
Feb 22 03:39:55 home [<ffffffff80419d7d>] move_addr_to_user+0x5d/0x70
Feb 22 03:39:55 home [<ffffffff8041a419>] sys_getsockname+0xd9/0xe0
Feb 22 03:39:55 home [<ffffffff802b01db>] d_instantiate+0x5b/0x70
Feb 22 03:39:55 home [<ffffffff8020c49e>] system_call+0x7e/0x83
Feb 22 03:39:55 home
Feb 22 03:39:55 home
Feb 22 03:39:55 home Code: 48 8b 04 c3 49 89 04 24 55 9d 66 45 85 ed 79 14 48 85 db 74
Feb 22 03:39:55 home RIP [<ffffffff80297f43>] kmem_cache_alloc+0x43/0xa0
Feb 22 03:39:55 home RSP <ffff8100187bdb28>
Feb 22 03:39:55 home ---[ end trace 3c27c650313471eb ]---
Feb 22 03:41:01 home general protection fault: 0000 [3] PREEMPT SMP
Feb 22 03:41:01 home CPU 0
Feb 22 03:41:01 home Modules linked in: usb_storage ipv6 fglrx(P) nls_cp437 vfat fat reiserfs ppdev lp ohci1394 ieee1394 firewire_ohci firewire_core crc_itu_t emu10k1_gp gameport parport_pc parport rtc_cmos rtc_core rtc_lib ppp_generic slhc pcspkr k8temp shpchp pci_hotplug forcedeth usbhid hid ff_memless i2c_nforce2 i2c_core sg evdev snd_seq_oss snd_seq_midi_event snd_seq thermal processor fan button snd_pcm_oss snd_mixer_oss battery ac snd_emu10k1 snd_rawmidi snd_ac97_codec ac97_bus snd_pcm snd_seq_device snd_timer snd_page_alloc snd_util_mem snd_hwdep snd soundcore ext3 jbd mbcache sd_mod ehci_hcd sr_mod ohci_hcd cdrom pata_acpi usbcore sata_nv ata_generic pata_amd libata
Feb 22 03:41:01 home Pid: 6138, comm: X Tainted: P D 2.6.24-ARCH #1
Feb 22 03:41:01 home RIP: 0010:[<ffffffff80298e22>] [<ffffffff80298e22>] __kmalloc+0x62/0x100
Feb 22 03:41:01 home RSP: 0018:ffff81001c823da8 EFLAGS: 00210086
Feb 22 03:41:01 home RAX: 0000000000000000 RBX: ff7f810000dc49c0 RCX: ffffffff882b5016
Feb 22 03:41:01 home RDX: 0000000000000020 RSI: 00000000000000d0 RDI: ffffffff805e0760
Feb 22 03:41:01 home RBP: 0000000000200286 R08: 0000000000000001 R09: 00007ffffd4059e0
Feb 22 03:41:01 home R10: 0001000000000000 R11: ffffffff803593e0 R12: ffff810001707560
Feb 22 03:41:01 home R13: 00000000000000d0 R14: ffff81001ee14d80 R15: ffffffff88450e68
Feb 22 03:41:01 home FS: 00002acfaf762730(0000) GS:ffffffff805d4000(0000) knlGS:00000000f7e526c0
Feb 22 03:41:01 home CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Feb 22 03:41:01 home CR2: 0000000000657f88 CR3: 000000001c82a000 CR4: 00000000000006e0
Feb 22 03:41:01 home DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Feb 22 03:41:01 home DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Feb 22 03:41:01 home Process X (pid: 6138, threadinfo ffff81001c822000, task ffff81001f3de000)
Feb 22 03:41:01 home Stack: ffff81001c823da8 0000000000000002 ffffffff88450e20 0000000000000050
Feb 22 03:41:01 home ffffffff88450e20 ffffffff882b5016 0000000000000020 ffff81001c823e38
Feb 22 03:41:01 home ffffffff88450e20 ffffffff88450e20 0000000000000000 ffffffff882ca2ea
Feb 22 03:41:01 home Call Trace:
Feb 22 03:41:01 home [<ffffffff882b5016>] :fglrx:drm_alloc+0x56/0x120
Feb 22 03:41:01 home [<ffffffff882ca2ea>] :fglrx:firegl_cmmqs_CWDDE_32+0x7a/0x310
Feb 22 03:41:01 home [<ffffffff882c9577>] :fglrx:firegl_cmmqs_CWDDE32+0x67/0xf0
Feb 22 03:41:01 home [<ffffffff882c9510>] :fglrx:firegl_cmmqs_CWDDE32+0x0/0xf0
Feb 22 03:41:01 home [<ffffffff882bd226>] :fglrx:firegl_ioctl+0x1b6/0x230
Feb 22 03:41:01 home [<ffffffff802aa0ed>] do_ioctl+0x7d/0xa0
Feb 22 03:41:01 home [<ffffffff802aa330>] vfs_ioctl+0x220/0x2c0
Feb 22 03:41:01 home [<ffffffff802aa461>] sys_ioctl+0x91/0xb0
Feb 22 03:41:01 home [<ffffffff8020c49e>] system_call+0x7e/0x83
Feb 22 03:41:01 home
Feb 22 03:41:01 home
Feb 22 03:41:01 home Code: 48 8b 04 c3 49 89 04 24 55 9d 66 45 85 ed 78 1e 48 89 d8 48
Feb 22 03:41:01 home RIP [<ffffffff80298e22>] __kmalloc+0x62/0x100
Feb 22 03:41:01 home RSP <ffff81001c823da8>
Feb 22 03:41:01 home ---[ end trace 3c27c650313471eb ]---
Feb 22 03:41:01 home [fglrx:firegl_release] *ERROR* device busy: 1 0
Feb 22 03:41:01 home [fglrx] release failed with code -EBUSY


Steps to reproduce:
wait.
   kernel.log (205.8 KiB)

Closed by  Aaron Griffin (phrakture)
Friday, 25 April 2008, 15:00 GMT
Reason for closing:  Fixed
Additional comments about closing:  Fixed in 2.6.25
Comment by Jan de Groot (JGC) - Wednesday, 27 February 2008, 08:06 GMT
Looks like a bug in fglrx that corrupts your memory. Other option is bad memory, though I don't think a new version of fglrx could trigger problems with that.
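One detail in the posted dumps is consistent with the memory-corruption reading: RBX holds ff7f810000dc49c0 in all three faults. On x86_64 with 48-bit virtual addresses, bits 63 through 47 of a pointer must all be equal (a "canonical" address), and touching a non-canonical address raises a general protection fault rather than a page fault. The Python sketch below checks that property for two values copied from the log. The function names are illustrative, and the idea that the faulting instruction dereferenced RBX is an assumption, not something the log states outright.

```python
# Values copied from the register dumps in the report.
RBX = 0xFF7F810000DC49C0  # identical in all three faults
R12 = 0xFFFF810001707560  # another kernel pointer from the same dumps


def is_canonical(addr, va_bits=48):
    """True if bits 63..(va_bits - 1) are all 0 or all 1."""
    top = addr >> (va_bits - 1)               # bits 63..47 when va_bits is 48
    all_ones = (1 << (64 - va_bits + 1)) - 1  # 17 one-bits when va_bits is 48
    return top == 0 or top == all_ones


def bits_to_fix(addr, va_bits=48):
    """Upper bit positions that disagree with bit (va_bits - 1)."""
    sign = (addr >> (va_bits - 1)) & 1
    return [b for b in range(va_bits, 64) if ((addr >> b) & 1) != sign]


print(is_canonical(R12))  # True
print(is_canonical(RBX))  # False
print(bits_to_fix(RBX))   # [55]
```

A single differing bit proves nothing by itself: it fits a stray write by a driver just as well as a hardware bit flip, which are the two options raised above.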
Comment by Nick Roberts (RobbeR49) - Wednesday, 27 February 2008, 12:20 GMT
I ran memtest86+ and didn't get any errors, so I tried downgrading to catalyst 8.01-1 (which I never had a problem with), but it froze on me as well. Maybe it's a combination of catalyst and the 2.6.24.2 kernel?
Comment by Nick Roberts (RobbeR49) - Monday, 10 March 2008, 13:02 GMT
It looks like catalyst 8.3 fixed this issue for me; I haven't had a freeze for a couple of days now.
Comment by Nick Roberts (RobbeR49) - Thursday, 27 March 2008, 15:42 GMT
Freezes began occurring again.
Comment by Aaron Griffin (phrakture) - Thursday, 27 March 2008, 15:43 GMT
As it seems like this is a catalyst bug, assigning to Travis
Comment by Nick Roberts (RobbeR49) - Thursday, 27 March 2008, 18:22 GMT
I tried the radeon driver and I'm still having problems, so it doesn't look like catalyst is the culprit. I'm still getting the sequence of general protection faults in my logs, but the radeon driver doesn't cause a hard freeze like catalyst does. Instead, I lose the ability to start any new programs from my menu (under openbox). I can exit openbox to a tty, but X still refuses to quit. It also seems like the general protection faults follow a specific order of processes, as far as I can tell:

hald-addon-stor --> all running GUI apps --> X

I think the only thing that doesn't seem to generate a general protection fault is the WM.

Any ideas?
Comment by Jordy van Wolferen (jordz) - Saturday, 05 April 2008, 11:07 GMT
I also run [testing], and my GNOME 2.22 keeps crashing too. I see you are running Firefox; I have the same problem, in that my system locks up when I'm using Firefox. I'm now testing with Epiphany, with no lockups so far. I don't know if there is already a bug report about Firefox not closing correctly, so that you need to kill firefox-bin. Maybe this can help you?
Comment by Nick Roberts (RobbeR49) - Saturday, 05 April 2008, 13:18 GMT
Well, I don't know if the title really applies much anymore - all of those packages are long since out of testing anyway. I don't think it's Firefox either, my freezes will happen whether I am running Firefox or not.

Now that I've had a few more freezes, I think it's the hal daemon. It seems to be the only constant in all of this, as it always starts the general protection faults. The processes after it don't always follow the order I gave above, but hald-addon-stor is always the first one.
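
The ordering question can be checked mechanically instead of by eye. The following is a minimal Python sketch, assuming the log keeps the format shown in the description (a "general protection fault: 0000 [N]" line followed by a "Pid: ..., comm: ..." line). The function name fault_rounds and the choice to treat a counter of 1 as the start of a new round are assumptions made for illustration; the sample lines are copied from the log in the description.

```python
import re

FAULT = re.compile(r"general protection fault: \w+ \[(\d+)\]")
COMM = re.compile(r"Pid: \d+, comm: (\S+)")


def fault_rounds(lines):
    """Group faulting process names into rounds.

    A new round starts whenever the bracketed oops counter is 1,
    which is assumed here to mean the first fault since boot.
    """
    rounds = []
    pending = False  # saw a fault header, waiting for its Pid/comm line
    for line in lines:
        m = FAULT.search(line)
        if m:
            if m.group(1) == "1" or not rounds:
                rounds.append([])
            pending = True
            continue
        m = COMM.search(line)
        if m and pending:
            rounds[-1].append(m.group(1))
            pending = False
    return rounds


sample = """\
Feb 22 03:39:40 home general protection fault: 0000 [1] PREEMPT SMP
Feb 22 03:39:40 home Pid: 6075, comm: hald-addon-stor Tainted: P 2.6.24-ARCH #1
Feb 22 03:39:55 home general protection fault: 0000 [2] PREEMPT SMP
Feb 22 03:39:55 home Pid: 6184, comm: firefox-bin Tainted: P D 2.6.24-ARCH #1
Feb 22 03:41:01 home general protection fault: 0000 [3] PREEMPT SMP
Feb 22 03:41:01 home Pid: 6138, comm: X Tainted: P D 2.6.24-ARCH #1
""".splitlines()

for order in fault_rounds(sample):
    print(" --> ".join(order))  # hald-addon-stor --> firefox-bin --> X
```

To run it on a full log, pass an open file object (for example the attached kernel.log, wherever it is saved) in place of sample. The first entry of each round is then the process that faulted first.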
Comment by Jordy van Wolferen (jordz) - Saturday, 05 April 2008, 13:36 GMT
Oh, OK. I have the problem that Firefox keeps running even after I close it. So I also didn't expect Firefox to be the problem, because I wasn't using it at the moment. But my crashes are gone when I'm using Epiphany as my browser.
Comment by Nick Roberts (RobbeR49) - Friday, 18 April 2008, 01:24 GMT
The very latest freeze didn't have hald-addon-stor as the first process with a general protection fault; openbox was the first one this time. So I'm guessing it's either a kernel bug that manifested itself after I moved to 2.6.24, or maybe it's still bad memory. I'm going to run memtest86+ again to see if anything pops up this time. I attached a truncated current kernel log; it has the last two rounds of protection faults in it.
Comment by Nick Roberts (RobbeR49) - Friday, 18 April 2008, 17:38 GMT
memtest86+ didn't turn up anything after running overnight, and I had another freeze this morning, but without anything in my logs. I was getting blinking keyboard LEDs, though, which has never happened before.
Comment by Nick Roberts (RobbeR49) - Friday, 25 April 2008, 11:15 GMT
I think 2.6.25 may have finally fixed the freezes; I haven't seen one in 5 days now. I'm going to request closure and hope I don't have to re-open it.
