FS#48691 - [linux] alsaplayer causes Linux kernel errors

Attached to Project: Arch Linux
Opened by Mike Dowling (mdowling) - Friday, 25 March 2016, 15:32 GMT
Last edited by Doug Newgard (Scimmia) - Friday, 20 October 2017, 14:51 GMT
Task Type Bug Report
Category Kernel
Status Closed
Assigned To Tobias Powalowski (tpowa)
Architecture x86_64
Severity Critical
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 0
Private No

Details

Description:
When an audio CD finishes, in graphics mode the system seizes without error messages. The hardware reset button must be used to reboot.

In text mode, a kernel error appears, Linux survives, but not very well.

For some reason (probably NoScript for Firefox) the attach button below is not working for me. I therefore include the text of my bug report here.

Linux moocow 4.4.5-1-ARCH #1 SMP PREEMPT Thu Mar 10 07:38:19 CET 2016
x86_64 GNU/Linux

> alsaplayer --version
alsaplayer 0.99.81


Since the error message appears to come from the kernel but caused by
alsaplayer, and since kernels should protect themselves from user-land
applications, I think this is critical.


Alsaplayer kills Linux when in graphics mode. When alsaplayer reaches
the end of the CD, all that can be done is to press the hardware reset
button.

I should perhaps say from the outset that my Arch Linux is unusual in
that I don't use a desktop like KDE and Gnome. I try to keep it
simple. When I start Linux, it enters text mode. I start X11 when
needed with "startx". I can return to text mode with CTRL-ALT-Fn,
where n is the tty number of the virtual console. This is a feature
that Xorg has disabled by default in recent years but I have turned back
on again.

So, boot; logon to some tty; startx; alsaplayer /dev/cdrom. Alsaplayer
then starts, but, a minor niggle perhaps, it always wants to start on
the second track of the CD. Click on the first track and start again.

Now alsaplayer plays the CD, as it is doing now as I write, even though
I have changed back to text mode to write this. Normally I would not
change back to text mode. In this case, when the CD has finished, the
system is quite dead. No error messages, just dead.

When, as now, I return to text mode, I do get errors on the active tty.
They look like this:

[ 6279.200530] BUG: unable to handle kernel NULL pointer dereference at 0000000000000002
[ 6279.200947] IP: [<ffffffffa0336662>] mmc_ioctl_cdrom_read_audio+0x212/0x3b0 [cdrom]
[ 6279.201564] PGD d017d067 PUD cb02c067 PMD 0
[ 6279.202317] Oops: 0000 [#1] PREEMPT SMP
[ 6279.203082] Modules linked in: nfnetlink_queue nfnetlink_log nfnetlink bluetooth nf_log_ipv4 nf_log_common xt_LOG nf_conntrack_ipv4 nf_defrag_ipv4 snd_hda_codec_hdmi xt_tcpudp cfg80211 rfkill xt_conntrack nf_conntrack iptable_filter intel_rapl iosf_mbi x86_pkg_temp_thermal intel_powerclamp coretemp kvm irqbypass crct10dif_pclmul crc32_pclmul crc32c_intel iTCO_wdt snd_hda_codec_realtek iTCO_vendor_support snd_hda_codec_generic ppdev snd_hda_intel snd_hda_codec aesni_intel aes_x86_64 lrw gf128mul glue_helper snd_hda_core snd_hwdep ablk_helper snd_pcm cryptd pcspkr serio_raw snd_timer r8169 mii snd input_leds uas mei_me led_class mousedev ie31200_edac soundcore mei edac_core i2c_i801 lpc_ich shpchp fan thermal
battery parport_pc fjes parport tpm_infineon tpm_tis evdev tpm mac_hid processor sch_fq_codel loop sg ip_tables x_tables ext4 crc16 mbcache jbd2 sr_mod cdrom sd_mod usb_storage hid_generic ata_generic pata_acpi usbhid hid atkbd libps2 ata_piix xhci_pci libata ehci_pci ehci_hcd xhci_hcd scsi_mod usbcore usb_common i8042 serio nouveau button video mxm_wmi wmi i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm
[ 6279.208090] CPU: 3 PID: 1940 Comm: alsaplayer Not tainted 4.4.5-1-ARCH #1
[ 6279.209022] Hardware name: Gigabyte Technology Co., Ltd. To be filled by O.E.M./B75M-D3H, BIOS F15 10/23/2013
[ 6279.210011] task: ffff8800378a1b80 ti: ffff880213ca4000 task.ti: ffff880213ca4000
[ 6279.210985] RIP: 0010:[<ffffffffa0336662>] [<ffffffffa0336662>] mmc_ioctl_cdrom_read_audio+0x212/0x3b0 [cdrom]
[ 6279.211961] RSP: 0018:ffff880213ca7cc8 EFLAGS: 00010282
[ 6279.212939] RAX: 0000000000000000 RBX: 0000000000000004 RCX: ffff8800379347b8
[ 6279.213912] RDX: ffff880213ca7c20 RSI: 0000000000000206 RDI: ffff8800cb08cf00
[ 6279.214876] RBP: ffff880213ca7d40 R08: 000012d8610e850e R09: 0000000000000002
[ 6279.215839] R10: 0000000000000020 R11: 0000000000000000 R12: 00000000000024c0
[ 6279.216775] R13: ffff880212ae0000 R14: 0000000000048929 R15: ffff880205c7e2e0
[ 6279.217715] FS: 00007fd2e21f4700(0000) GS:ffff88021dd80000(0000) knlGS:0000000000000000
[ 6279.218642] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 6279.219576] CR2: 0000000000000002 CR3: 00000000d4fea000 CR4: 00000000001406e0
[ 6279.220513] Stack:
[ 6279.221444] 00007fd2e21f19b0 0000000400000064 0000000400048929 ffff8800cb08cf00
[ 6279.222397] 00007fd2e21f19b0 ffff8800379347b8 0000000100048929 0000000000000004
[ 6279.223355] 00007fd2e21f19b0 00000000fea7035d 00000000ffffffe7 000000000000530e
[ 6279.224312] Call Trace:
[ 6279.225260] [<ffffffffa033750a>] cdrom_ioctl+0xd0a/0x10fe [cdrom]
[ 6279.226211] [<ffffffff810f43b0>] ? futex_wake+0x90/0x170
[ 6279.227169] [<ffffffffa036db2c>] sr_block_ioctl+0x7c/0xc0 [sr_mod]
[ 6279.228126] [<ffffffff812ac3a9>] blkdev_ioctl+0x259/0x8f0
[ 6279.229087] [<ffffffff8121b80d>] block_ioctl+0x3d/0x50
[ 6279.230050] [<ffffffff811f3258>] do_vfs_ioctl+0x298/0x480
[ 6279.231006] [<ffffffff811fd307>] ? __fget+0x77/0xb0
[ 6279.231964] [<ffffffff811f34b9>] SyS_ioctl+0x79/0x90
[ 6279.232917] [<ffffffff815970ee>] entry_SYSCALL_64_fastpath+0x12/0x6d
[ 6279.233876] Code: 87 20 01 00 00 49 8b 47 68 48 8b 71 18 31 c9 48 89 45 a0 e8 81 c2 f6 e0 85 c0 74 55 49 8b 87 30 01 00 00 48 8b 4d b0 48 8b 7d a0 <0f> b6 40 02 83 e0 0f 88 41 60 e8 1f bf f6 e0 85 c0 0f 84 6b 01
[ 6279.234927] RIP [<ffffffffa0336662>] mmc_ioctl_cdrom_read_audio+0x212/0x3b0 [cdrom]
[ 6279.235962] RSP <ffff880213ca7cc8>
[ 6279.236970] CR2: 0000000000000002
[ 6279.238039] ---[ end trace 40456d9499d09171 ]---


This looks to me like a kernel panic!

But when I'm in text mode, the system does not die. At first glance, it
would appear to have survived in tack. But the first oddity I then note
is that I cannot retrieve the CD until I reboot. I can return to
graphics mode, and thing otherwise look normal. But in text mode, I see
periodic messages like these.


[ 6512.976407] INVALID input: IN=enp3s0 OUT= MAC=90:2b:34:9c:af:2a:20:2b:c1:54:8c:fe:08:00 SRC=66.102.1.95 DST=192.168.2.101 LEN=40 TOS=0x00 PREC=0x00 TTL=45 ID=37349 PROTO=TCP SPT=443 DPT=56384 WINDOW=0 RES=0x00 RST URGP=0
[ 8358.975784] INVALID input: IN=enp3s0 OUT= MAC=90:2b:34:9c:af:2a:20:2b:c1:54:8c:fe:08:00 SRC=173.194.116.195 DST=192.168.2.101 LEN=40 TOS=0x00 PREC=0x00 TTL=57 ID=23870 PROTO=TCP SPT=443 DPT=44272 WINDOW=0 RES=0x00 RST URGP=0
[ 8358.977537] INVALID input: IN=enp3s0 OUT= MAC=90:2b:34:9c:af:2a:20:2b:c1:54:8c:fe:08:00 SRC=173.194.116.195 DST=192.168.2.101 LEN=40 TOS=0x00 PREC=0x00 TTL=57 ID=23871 PROTO=TCP SPT=443 DPT=44272 WINDOW=0 RES=0x00 RST URGP=0
[ 8358.978583] INVALID input: IN=enp3s0 OUT= MAC=90:2b:34:9c:af:2a:20:2b:c1:54:8c:fe:08:00 SRC=173.194.116.164 DST=192.168.2.101 LEN=40 TOS=0x00 PREC=0x00 TTL=57 ID=7372 PROTO=TCP SPT=443 DPT=42280 WINDOW=0 RES=0x00 RST URGP=0
[ 8358.979639] INVALID input: IN=enp3s0 OUT= MAC=90:2b:34:9c:af:2a:20:2b:c1:54:8c:fe:08:00 SRC=173.194.116.164 DST=192.168.2.101 LEN=40 TOS=0x00 PREC=0x00 TTL=57 ID=7373 PROTO=TCP SPT=443 DPT=42280 WINDOW=0 RES=0x00 RST URGP=0
[ 8406.244568] IN= OUT=enp3s0 SRC=192.168.2.101 DST=194.25.134.106 LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=35409 DF PROTO=TCP SPT=53608 DPT=25 WINDOW=29200 RES=0x00 SYN URGP=0
[ 8407.247654] IN= OUT=enp3s0 SRC=192.168.2.101 DST=194.25.134.106 LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=35410 DF PROTO=TCP SPT=53608 DPT=25 WINDOW=29200 RES=0x00 SYN URGP=0
[ 8409.251041] IN= OUT=enp3s0 SRC=192.168.2.101 DST=194.25.134.106 LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=35411 DF PROTO=TCP SPT=53608 DPT=25 WINDOW=29200 RES=0x00 SYN URGP=0
[ 8413.264515] IN= OUT=enp3s0 SRC=192.168.2.101 DST=194.25.134.106 LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=35412 DF PROTO=TCP SPT=53608 DPT=25 WINDOW=29200 RES=0x00 SYN URGP=0
[ 8421.291400] IN= OUT=enp3s0 SRC=192.168.2.101 DST=194.25.134.106 LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=35413 DF PROTO=TCP SPT=53608 DPT=25 WINDOW=29200 RES=0x00 SYN URGP=0
[ 8437.281886] IN= OUT=enp3s0 SRC=192.168.2.101 DST=194.25.134.42 LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=65312 DF PROTO=TCP SPT=36022 DPT=25 WINDOW=29200 RES=0x00 SYN URGP=0
[ 8439.285278] IN= OUT=enp3s0 SRC=192.168.2.101 DST=194.25.134.42 LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=65313 DF PROTO=TCP SPT=36022 DPT=25 WINDOW=29200 RES=0x00 SYN URGP=0
[ 8443.292092] IN= OUT=enp3s0 SRC=192.168.2.101 DST=194.25.134.42 LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=65314 DF PROTO=TCP SPT=36022 DPT=25 WINDOW=29200 RES=0x00 SYN URGP=0
[ 8451.318997] IN= OUT=enp3s0 SRC=192.168.2.101 DST=194.25.134.42 LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=65315 DF PROTO=TCP SPT=36022 DPT=25 WINDOW=29200 RES=0x00 SYN URGP=0
[ 8473.332991] IN= OUT=enp3s0 SRC=192.168.2.101 DST=194.25.134.107 LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=48482 DF PROTO=TCP SPT=53614 DPT=25 WINDOW=29200 RES=0x00 SYN URGP=0

At a guess, the kernel is writing to the active tty what systemd writes
to the journal. (They are witten to the journal as well.)

The final oddity is when I reboot. Normally, a reboot takes less than a
second, and everything written to the monitor happens too fast to read,
and very little gets written anyway. But after alsaplayer has put my
Linux in its "unusual state", I see lots of messages from systemd coming
at intervals that can be measured in seconds. It's trying to
shut down systemd services.

I have a second computer which behaves in exactly the same way. Also,
this is not a new phenomenon; it's probably been happening for years,
certainly many months. I at first thought that the problem is so
extreme that no bug report was necessary; it would be corrected with the
next release. But the problem continues over several updates.

My current guess is that Gnome and KDE probably have a mechanism for
catching problems like this, but naked Xorg X11 cannot, so I see the
problem that others do not.

Cheers,
Mike Dowling


Additional info:
* package version(s)
* config and/or log files etc.


Steps to reproduce:
Plac an audio CD to the end.
This task depends upon

Closed by  Doug Newgard (Scimmia)
Friday, 20 October 2017, 14:51 GMT
Reason for closing:  No response
Comment by Maxime Morel (popux) - Wednesday, 30 March 2016, 09:26 GMT
Can you unload cdrom module after the crash in text mode?
# modprobe -rv cdrom
Maybe you will have to unload other modules depending on this one first (on my system I have to do: modprobe -rv sr_mod cdrom).
Then reload it:
# modprobe -v cdrom
Comment by Mike Dowling (mdowling) - Wednesday, 30 March 2016, 11:04 GMT
No, I cannot unload cdrom. Instead, I get the message that the module is in use. This explains why I cannot remove the CD after the crash. I could not find any other modules to unload that seemed relevant.
Comment by Maxime Morel (popux) - Wednesday, 30 March 2016, 11:23 GMT
To see the module depending on cdrom module, do:
$ lsmod | grep cdrom

Here is my output (modules using cdrom module are on the right column):
cdrom 49152 1 sr_mod

You tried ejecting with "eject" command or with the hardware button?
I guess using eject command will be blocking, and won't give you back the prompt?
Comment by Mike Dowling (mdowling) - Wednesday, 30 March 2016, 16:44 GMT
To tell the truth, I didn't even know that there was an "eject" command. So, I experiment, and here are the results.

Start a CD using alsaplayer, and call "eject" while it's playing; you're right, eject blocks. It takes "kill -9" to kill the tty, but I can then still log in again.

After using "eject" and trying to reboot, the system hangs for about three minutes but eventually does reboot. In the journal entry to the effect that the system could not umount /home.
(Good that ext4 is a journalling filesystem!)

The kernel still cannot dereference the null pointer.

My output from lsmod is similar to yours:
cdrom 49152 1 sr_mod

Presumably, the the kernel thinks that the CD is still in use, and will not eject. Pressing the hardware button therefore has no effect. The "eject" command waits for the CH to eject, and so hangs, thus blocking the tty or xterm.
Comment by Maxime Morel (popux) - Thursday, 31 March 2016, 08:59 GMT
It seems to be related to that: https://www.spinics.net/lists/linux-usb/msg135163.html
What is your cdrom drive? External connected with USB? Or internal connected with SATA or IDE?
Comment by Mike Dowling (mdowling) - Thursday, 31 March 2016, 09:37 GMT
It's an internal CSDVD recorder. Here's an excerpt from the journal from the boot this morning:

Mar 31 09:32:06 moocow kernel: ata2.00: ATAPI: TSSTcorp CDDVDW SH-224DB, SB01, max UDMA/100
Mar 31 09:32:06 moocow kernel: ata2.00: configured for UDMA/100
Mar 31 09:32:06 moocow kernel: scsi 1:0:0:0: CD-ROM TSSTcorp CDDVDW SH-224DB SB01 PQ: 0 ANSI: 5
Comment by mattia (nTia89) - Monday, 02 October 2017, 20:37 GMT
is this issue still valid?

Loading...