FS#18014 - [oss] crashes repeatedly

Attached to Project: Community Packages
Opened by Rene (hit) - Tuesday, 26 January 2010, 13:45 GMT
Last edited by Dan Griffiths (Ghost1227) - Monday, 01 March 2010, 20:35 GMT
Task Type Bug Report
Category Upstream Bugs
Status Closed
Assigned To Dan Griffiths (Ghost1227)
Architecture All
Severity Critical
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 1
Private No

Details

Description:
OSS crashes repeatedly for some reason.

Additional info:
* dmesg

irq 17: nobody cared (try booting with the "irqpoll" option)
Pid: 0, comm: swapper Not tainted 2.6.32-ARCH #1
Call Trace:
[<c108a134>] ? __report_bad_irq+0x24/0x90
[<c108a2f0>] ? note_interrupt+0x150/0x190
[<c108aa4b>] ? handle_fasteoi_irq+0xab/0xd0
[<c10064e5>] ? handle_irq+0x15/0x30
[<c1005a57>] ? do_IRQ+0x47/0xc0
[<c1047002>] ? irq_exit+0x52/0x70
[<c1005a60>] ? do_IRQ+0x50/0xc0
[<c1035463>] ? select_nohz_load_balancer+0x53/0x140
[<c10040d0>] ? common_interrupt+0x30/0x38
[<c100b00d>] ? mwait_idle+0x4d/0xc0
[<c1002904>] ? cpu_idle+0x94/0xe0
[<c13e285d>] ? start_kernel+0x335/0x33a
[<c13e2366>] ? unknown_bootoption+0x0/0x190
handlers:
[<f5f7eb00>] (osscore_intr+0x0/0x60 [osscore])
Disabling IRQ #17
osscore: Output timed out (sync) on audio engine 1
198831 fifo errors were detected

After restarting OSS, another one shows up in dmesg:

osscore: Output timed out on audio engine 1/'Intel ICH5 (24D5) (vmix)' (count=42274816)
irq 17: nobody cared (try booting with the "irqpoll" option)
Pid: 13541, comm: deluged Not tainted 2.6.32-ARCH #1
Call Trace:
[<c108a134>] ? __report_bad_irq+0x24/0x90
[<f63516a8>] ? ichintr+0x2e8/0x8f0 [oss_ich]
[<c108a2f0>] ? note_interrupt+0x150/0x190
[<c108aa4b>] ? handle_fasteoi_irq+0xab/0xd0
[<c10064e5>] ? handle_irq+0x15/0x30
[<c1005a57>] ? do_IRQ+0x47/0xc0
[<f6295b28>] ? osscore_intr+0x28/0x60 [osscore]
[<c10040d0>] ? common_interrupt+0x30/0x38
[<c1221d60>] ? __copy_skb_header+0x0/0x170
[<c1221eef>] ? __skb_clone+0x1f/0xd0
[<c1267869>] ? tcp_transmit_skb+0x69/0x6d0
[<c10040d0>] ? common_interrupt+0x30/0x38
[<c1269d40>] ? tcp_write_xmit+0x190/0x980
[<c126a691>] ? tcp_send_fin+0x91/0x180
[<c126a59e>] ? __tcp_push_pending_frames+0x2e/0x90
[<c125bcff>] ? tcp_close+0x32f/0x410
[<c127b2e3>] ? inet_release+0x33/0x60
[<c121d2aa>] ? sock_release+0x1a/0x80
[<c121d31f>] ? sock_close+0xf/0x30
[<c10e4e67>] ? __fput+0xc7/0x1d0
[<c10e1847>] ? filp_close+0x47/0x80
[<c10e18f0>] ? sys_close+0x70/0xc0
[<c10039f3>] ? sysenter_do_call+0x12/0x28
handlers:
[<f6295b00>] (osscore_intr+0x0/0x60 [osscore])
Disabling IRQ #17
157819 fifo errors were detected

After these errors, sound starts stuttering. Again, restarting OSS fixes it for a couple of minutes (or less).
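
To compare reports like the two above, the traces can be condensed to the IRQ number and the handlers the kernel lists for it. The following is a minimal sketch, assuming the dmesg text has no timestamp prefixes (as in the excerpts pasted here) and that handler lines have the `[<addr>] (symbol+0x.../0x... [module])` shape shown above. The function name `summarize_unhandled_irqs` is hypothetical.

```python
import re

# Patterns match the line shapes seen in the pasted dmesg excerpts.
IRQ_RE = re.compile(r'^irq (\d+): nobody cared')
HANDLER_RE = re.compile(
    r'^\[<[0-9a-f]+>\] \((\w+)\+0x[0-9a-f]+/0x[0-9a-f]+ \[(\w+)\]\)'
)


def summarize_unhandled_irqs(dmesg_text):
    """Return [(irq, [(handler, module), ...]), ...], one entry per report."""
    reports = []
    current = None       # report currently being collected
    in_handlers = False  # True while reading lines under "handlers:"
    for raw in dmesg_text.splitlines():
        line = raw.strip()
        m = IRQ_RE.match(line)
        if m:
            current = (int(m.group(1)), [])
            reports.append(current)
            in_handlers = False
            continue
        if current is None:
            continue
        if line == "handlers:":
            in_handlers = True
            continue
        if in_handlers:
            h = HANDLER_RE.match(line)
            if h:
                current[1].append((h.group(1), h.group(2)))
            else:
                # First non-handler line (e.g. "Disabling IRQ #17") ends the report.
                in_handlers = False
                current = None
    return reports
```

Fed the first trace above, this should yield one report for IRQ 17 with `osscore_intr` from the `osscore` module as the only registered handler.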

* package version(s)
firefox 3.6-2
jre 6u17-1
oss 4.2_2002-2
kernel26 2.6.32.6-1

Steps to reproduce:
1. Play sounds with OSS

Yes, [testing] is enabled, but only because this was already happening before; I enabled it to see whether the issue was fixed there.

Booting with "irqpoll" didn't help.

Is there any other information I can provide to help investigation?
Closed by Dan Griffiths (Ghost1227)
Monday, 01 March 2010, 20:35 GMT
Reason for closing: Works for me
Additional comments about closing: OP requested close.
Comment by Ionut Biru (wonder) - Tuesday, 26 January 2010, 14:26 GMT
Did you try rebuilding it using ABS? I remember there was a problem with external modules and kernels > 2.6.32.3.
Comment by Paulo Matias (thotypous) - Tuesday, 26 January 2010, 14:51 GMT
Good idea. Please try building from ABS, as wonder said, to see if it fixes the issue. I haven't had any issues with my card here, but behavior may vary from card to card.

Also, what was the last kernel version where everything worked correctly?
Comment by Rene (hit) - Tuesday, 26 January 2010, 16:26 GMT
It might have been 2.6.32.3, since this OSS issue started a couple of kernel updates ago.
I have now rebuilt OSS, along with alsa-oss and alsa-plugins, and the problem seems to occur only with Java apps in the browser now (games and so on). I can hear the sounds, but the browser hangs heavily and system memory usage is massive. Also, dmesg is filled with repeating messages, which I have now attached.
   dmesg (3.7 KiB)
Comment by Rene (hit) - Thursday, 28 January 2010, 19:29 GMT
Never mind about Java; it happens at any time.
Comment by Paulo Matias (thotypous) - Saturday, 30 January 2010, 18:10 GMT
After looking at the source code for a while, I can't see any way for the "osscore: Failed to create a vmix engine, error=-16" messages to show up other than this check in vmix_core.c:

if (n + 1 >= MAX_CLIENTS) /* Cannot create more client engines */
    return OSS_EBUSY;

16 is EBUSY, and that is the only place returning OSS_EBUSY that could lead to such a message.

Because of this, the only way I can see this message showing up is that more than MAX_CLIENTS (in our build, 9) instances of /dev/dsp are open at the same time. If this is the case, you should see a lot of processes in the per-application mixer controls in ossxmix when this error occurs.

In that case it would be worth checking which applications those are and whether some of them are opening /dev/dsp excessively, causing this.
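
One way to check this without ossxmix is to count, per process, the file descriptors that point at /dev/dsp. The following is a minimal sketch, assuming a Linux /proc layout and assuming that each open descriptor on a /dev/dsp* node corresponds to one vmix client (not verified against the OSS source). The function names and the `proc_root` and `prefix` parameters are hypothetical; seeing other users' processes requires root.

```python
import os
from collections import Counter


def _argv0(proc_root, pid):
    """Best-effort program name from /proc/<pid>/cmdline."""
    try:
        with open(os.path.join(proc_root, pid, "cmdline"), "rb") as f:
            first = f.read().split(b"\0")[0]
    except OSError:
        return "?"
    return first.decode(errors="replace") or "?"


def count_dsp_openers(proc_root="/proc", prefix="/dev/dsp"):
    """Count descriptors whose target starts with `prefix`, per (pid, name)."""
    counts = Counter()
    for pid in os.listdir(proc_root):
        if not pid.isdigit():
            continue  # skip non-process entries such as "self"
        fd_dir = os.path.join(proc_root, pid, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process exited, or permission denied
        for fd in fds:
            try:
                target = os.readlink(os.path.join(fd_dir, fd))
            except OSError:
                continue
            if target.startswith(prefix):
                counts[(int(pid), _argv0(proc_root, pid))] += 1
    return counts
```

Calling `count_dsp_openers().most_common()` while the error is occurring would list the heaviest openers first. A total close to the MAX_CLIENTS value quoted above would support the theory.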

As for the "page allocation failure", I can't see why it happens, nor whether it could be a cause of the issues that happen after it (including the vmix engine creation failure). I tried looking for cesium on IRC but couldn't find him. You may want to post this bug on the OSS forum (http://www.opensound.com/forum/) to see if he or someone else experienced with OSS can help debug it.

I have no other ideas for now :(
Comment by Andrea Scarpino (BaSh) - Thursday, 04 February 2010, 20:58 GMT
Assigned to wonder because Paulo left and this package is now orphaned.
Comment by Ionut Biru (wonder) - Thursday, 04 February 2010, 20:59 GMT
I don't use it, so I can't maintain or fix this.
Comment by Rene (hit) - Thursday, 04 February 2010, 21:35 GMT
Comment by Dan Griffiths (Ghost1227) - Friday, 05 February 2010, 01:39 GMT
Assigned to me because I do use it and am taking over as maintainer :P

However... I can't replicate this issue (and I have been using OSS for a long time now...)
Comment by Rene (hit) - Friday, 12 February 2010, 12:09 GMT
Fresh install and same sh*t!

[hit@sahver ~]$ dmesg
...
irq 17: nobody cared (try booting with the "irqpoll" option)
Pid: 0, comm: swapper Not tainted 2.6.32-ARCH #1
Call Trace:
[<c108a2b4>] ? __report_bad_irq+0x24/0x90
[<c108a470>] ? note_interrupt+0x150/0x190
[<c108abcb>] ? handle_fasteoi_irq+0xab/0xd0
[<c10064e5>] ? handle_irq+0x15/0x30
[<c1005a57>] ? do_IRQ+0x47/0xc0
[<c10354f3>] ? select_nohz_load_balancer+0x53/0x140
[<c10040d0>] ? common_interrupt+0x30/0x38
[<c100b01d>] ? mwait_idle+0x4d/0xc0
[<c1002904>] ? cpu_idle+0x94/0xe0
[<c13e485d>] ? start_kernel+0x335/0x33a
[<c13e4366>] ? unknown_bootoption+0x0/0x190
handlers:
[<f1382b00>] (osscore_intr+0x0/0x60 [osscore])
Disabling IRQ #17
osscore: Output timed out on audio engine 1/'Intel ICH5 (24D5) (vmix)' (count=7659520)
109278 fifo errors were detected
[hit@sahver ~]$ sudo /etc/rc.d/oss restart
:: Stopping Open Sound System [DONE]
:: Starting Open Sound System [DONE]
[hit@sahver ~]$

After the restart everything is OK again. It just drives me crazy and makes me want to make a big fire and throw everything into it.
Comment by Dan Griffiths (Ghost1227) - Wednesday, 24 February 2010, 06:51 GMT
Any news?
Comment by Rene (hit) - Monday, 01 March 2010, 11:36 GMT
After messing around with different bleeding-edge packages[1] and changing BIOS settings[2], the OSS crashes seem to be gone.
So I guess we can close it...

[1] udisks-git, gvfs-git, etc
[2] IRQ addresses set to manual instead of automatic
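
Since the change in [2] affects how interrupt lines are assigned, it can help to record which drivers are registered on a given IRQ before and after such a change. The following is a minimal sketch, assuming the older /proc/interrupts layout of a CPU header line followed by `IRQ: per-CPU counts, controller type, comma-separated device names`. Newer kernels add columns, so the offsets here would need adjusting. The function name `devices_on_irq` is hypothetical.

```python
def devices_on_irq(interrupts_text, irq):
    """Return the device names registered on `irq` in /proc/interrupts text."""
    lines = interrupts_text.splitlines()
    if not lines:
        return []
    ncpu = len(lines[0].split())  # header row has one column per CPU
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if not sep or label.strip() != str(irq):
            continue
        fields = rest.split()
        # Skip the per-CPU counters and the controller type column.
        names = " ".join(fields[ncpu + 1:])
        return [n.strip() for n in names.split(",") if n.strip()]
    return []
```

For example, `devices_on_irq(open("/proc/interrupts").read(), 17)` lists whatever is attached to IRQ 17, the line named in the traces above. More than one name means the line is shared between devices.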