FS#69190 - [linux] kernel 5.10.4 drm/i915/snd_hda_codec_hdmi crash

Attached to Project: Arch Linux
Opened by Vladimir (_v_l) - Sunday, 03 January 2021, 12:21 GMT
Last edited by Toolybird (Toolybird) - Sunday, 04 June 2023, 03:27 GMT
Task Type Bug Report
Category Kernel
Status Closed
Assigned To Jan Alexander Steffens (heftig)
Architecture All
Severity Low
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 1
Private No

Details

Description: drm/i915/snd_hda_codec_hdmi module in kernel 5.10.4 crashes when a monitor connected by HDMI cable goes to power save mode.

An almost identical system connected to a monitor by a DVI-HDMI cable works fine.

Additional info:
* kernel: 5.10.4 (arch, zen, ck);
* Xorg: 1.20.10-3
* dmesg output:
[code]
[72129.500531] INFO: task Xorg:698 blocked for more than 1228 seconds.
[72129.500533] Not tainted 5.10.4-1-ck #1
[72129.500533] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[72129.500534] Xorg D stack: 0 pid: 698 ppid: 1 flags:0x00004084
[72129.500536] Call Trace:
[72129.500540] __schedule+0x5f7/0xd60
[72129.500542] schedule+0x5b/0xc0
[72129.500543] schedule_preempt_disabled+0x11/0x20
[72129.500544] __mutex_lock.constprop.0+0x17d/0x4f0
[72129.500548] sync_eld_via_acomp+0x3f/0x350 [snd_hda_codec_hdmi]
[72129.500550] check_presence_and_report+0x57/0x80 [snd_hda_codec_hdmi]
[72129.500578] intel_audio_codec_enable+0x11f/0x180 [i915]
[72129.500598] intel_enable_ddi+0x446/0x5a0 [i915]
[72129.500617] intel_encoders_enable+0x74/0xa0 [i915]
[72129.500634] hsw_crtc_enable+0x1c3/0x5c0 [i915]
[72129.500652] intel_enable_crtc+0x48/0x60 [i915]
[72129.500669] skl_commit_modeset_enables+0x236/0x4e0 [i915]
[72129.500687] intel_atomic_commit_tail+0x2c6/0x1230 [i915]
[72129.500689] ? complete+0x2e/0x40
[72129.500690] ? flush_workqueue_prep_pwqs+0x11e/0x130
[72129.500692] ? flush_workqueue+0x19d/0x3f0
[72129.500709] intel_atomic_commit+0x312/0x390 [i915]
[72129.500721] drm_atomic_connector_commit_dpms+0xd9/0x100 [drm]
[72129.500729] drm_mode_obj_set_property_ioctl+0x1bf/0x420 [drm]
[72129.500731] ? filemap_fault+0x3f3/0x990
[72129.500738] ? drm_connector_set_obj_prop+0x90/0x90 [drm]
[72129.500744] drm_connector_property_set_ioctl+0x39/0x60 [drm]
[72129.500749] drm_ioctl_kernel+0xb3/0x100 [drm]
[72129.500755] drm_ioctl+0x236/0x3d0 [drm]
[72129.500762] ? drm_connector_set_obj_prop+0x90/0x90 [drm]
[72129.500765] __x64_sys_ioctl+0x83/0xb0
[72129.500766] do_syscall_64+0x33/0x40
[72129.500767] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[72129.500769] RIP: 0033:0x7f611e1a8f6b
[72129.500769] RSP: 002b:00007fffbcd740e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[72129.500771] RAX: ffffffffffffffda RBX: 00007fffbcd74120 RCX: 00007f611e1a8f6b
[72129.500771] RDX: 00007fffbcd74120 RSI: 00000000c01064ab RDI: 000000000000000e
[72129.500772] RBP: 00000000c01064ab R08: 00005613fb3f7698 R09: 0000000000000000
[72129.500772] R10: 0000000000000000 R11: 0000000000000246 R12: 00005613fb6194d0
[72129.500772] R13: 000000000000000e R14: 0000000000000000 R15: 0000000000000000
[/code]
* reported upstream as https://gitlab.freedesktop.org/drm/intel/-/issues/2883


Steps to reproduce:
Install kernel 5.10.4, connect the monitor with an HDMI-HDMI cable, wait until the monitor goes into power save mode ("sleep"), then try to wake it up. (Actually, I found that the monitor never turned off: something seems to go wrong while it is entering power save mode and the transition never completes, so Xorg hangs and the monitor stays on.)

I seem not to be alone with this problem; I found this post on the forum: https://bbs.archlinux.org/viewtopic.php?pid=1947358#p1947358 .
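
For anyone checking whether their own logs show the same hang, here is a minimal Python sketch that pulls hung-task reports out of saved dmesg or journal text. The function name find_hung_tasks is made up for illustration, and the pattern is based only on the message format visible in the dmesg output above; other kernel versions may word it differently.

```python
import re

# Pattern for the hung-task message as it appears in this report, e.g.
# "INFO: task Xorg:698 blocked for more than 1228 seconds."
_HUNG_RE = re.compile(
    r"INFO: task (?P<comm>.+?):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds\."
)


def find_hung_tasks(log_text):
    """Return (command, pid, seconds) for each hung-task line in log_text."""
    return [
        (m.group("comm"), int(m.group("pid")), int(m.group("secs")))
        for m in _HUNG_RE.finditer(log_text)
    ]


if __name__ == "__main__":
    sample = "[72129.500531] INFO: task Xorg:698 blocked for more than 1228 seconds."
    print(find_hung_tasks(sample))
```

It can be fed the saved output of dmesg or journalctl -k; a hit for Xorg together with sync_eld_via_acomp in the following call trace would match what is described here.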

Closed by  Toolybird (Toolybird)
Sunday, 04 June 2023, 03:27 GMT
Reason for closing:  Upstream
Additional comments about closing:  Clearly an upstream issue. If still happening, please follow up by contacting the kernel folks.
Comment by John (graysky) - Sunday, 03 January 2021, 14:33 GMT
This commit[1] helps the monitor successfully awaken from sleep (4x now), but each time it is accompanied by a kernel oops. Example:

[Jan 3 09:24] ------------[ cut here ]------------
[ +0.000006] WARNING: CPU: 6 PID: 864 at sound/pci/hda/patch_hdmi.c:2136 hdmi_pcm_close+0x1c8/0x1d0 [snd_hda_codec_hdmi]
[ +0.000001] Modules linked in: f2fs overlay dm_crypt cbc encrypted_keys trusted tpm rng_core joydev mousedev hid_microsoft ff_memless usbhid ip6t_REJECT nf_reject_ipv6 iwlmvm mac80211 libarc4 intel_rapl_msr intel_rapl_common x86_pkg_temp_thermal intel_powerclamp iwlwifi kvm_intel xt_hl ip6t_rt kvm snd_hda_codec_realtek irqbypass crct10dif_pclmul dm_mod crc32_pclmul ghash_clmulni_intel snd_hda_codec_generic iTCO_wdt intel_pmc_bxt at24 snd_hda_codec_hdmi iTCO_vendor_support cfg80211 mei_hdcp ledtrig_audio aesni_intel mxm_wmi snd_hda_intel snd_intel_dspcfg crypto_simd snd_hda_codec cryptd glue_helper rapl snd_hwdep i2c_i801 intel_cstate snd_hda_core pcspkr i2c_smbus intel_uncore lpc_ich snd_pcm e1000e mei_me rfkill snd_timer ipt_REJECT mei nf_reject_ipv4 snd soundcore xt_multiport xt_comment wmi intel_smartconnect acpi_pad mac_hid xt_limit xt_addrtype xt_tcpudp xt_conntrack ip6table_filter ip6_tables nf_conntrack_netbios_ns nf_conntrack_broadcast nf_nat_ftp nf_nat nf_conntrack_ftp
[ +0.000031] nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c iptable_filter xt_iprange xt_mark xt_NFQUEUE nct6775 fuse hwmon_vid coretemp ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 crc32c_intel xhci_pci xhci_pci_renesas i915 video intel_gtt i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops cec drm agpgart
[ +0.000014] CPU: 6 PID: 864 Comm: pulseaudio Not tainted 5.10.4-1-minimum #1
[ +0.000001] Hardware name: MSI MS-7888/Z97 MPOWER MAX AC (MS-7888), BIOS V1.11 02/16/2016
[ +0.000002] RIP: 0010:hdmi_pcm_close+0x1c8/0x1d0 [snd_hda_codec_hdmi]
[ +0.000001] Code: 4c 89 e7 b9 07 0f 00 00 e8 a5 ec 11 00 0f b7 33 b9 07 07 00 00 31 d2 83 e0 bf 4c 89 e7 41 89 c0 e8 cd eb 11 00 e9 5a ff ff ff <0f> 0b e9 ec fe ff ff 90 0f 1f 44 00 00 41 56 41 89 d6 ba 98 06 00
[ +0.000001] RSP: 0018:ffffaea501347e18 EFLAGS: 00010246
[ +0.000001] RAX: ffff9f2084159900 RBX: 0000000000000002 RCX: 0000000000000000
[ +0.000000] RDX: 0000000000000000 RSI: 0000000000000002 RDI: 0000000000000003
[ +0.000001] RBP: ffff9f2083c64000 R08: ffff9f2084159900 R09: 0000000000000028
[ +0.000000] R10: ffff9f208248e600 R11: 0000000000000000 R12: ffff9f2085787000
[ +0.000001] R13: ffff9f2080e1a1d8 R14: ffff9f208c60fb08 R15: ffff9f2080e1a000
[ +0.000001] FS: 00007f8a02c84800(0000) GS:ffff9f2780380000(0000) knlGS:0000000000000000
[ +0.000000] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ +0.000001] CR2: 00007f3863d84910 CR3: 000000014ab86003 CR4: 00000000001706e0
[ +0.000000] Call Trace:
[ +0.000009] azx_pcm_close+0x75/0xf0 [snd_hda_codec]
[ +0.000006] snd_pcm_release_substream.part.0+0x40/0xd0 [snd_pcm]
[ +0.000002] snd_pcm_release+0x4e/0xb0 [snd_pcm]
[ +0.000003] __fput+0x8e/0x230
[ +0.000002] task_work_run+0x5c/0x90
[ +0.000002] exit_to_user_mode_prepare+0xf7/0x120
[ +0.000002] syscall_exit_to_user_mode+0x28/0x160
[ +0.000002] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ +0.000002] RIP: 0033:0x7f8a03619f9b
[ +0.000001] Code: 8b 15 d9 6e 0c 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb 89 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 0b 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d a5 6e 0c 00 f7 d8 64 89 01 48
[ +0.000000] RSP: 002b:00007ffc35e1f1f8 EFLAGS: 00000202 ORIG_RAX: 000000000000000b
[ +0.000001] RAX: 0000000000000000 RBX: 000055cc0ad1c9e8 RCX: 00007f8a03619f9b
[ +0.000000] RDX: 0000000000000010 RSI: 0000000000001000 RDI: 00007f89fe946000
[ +0.000001] RBP: 000055cc0acc84b0 R08: 0000000000000000 R09: 0000000000000010
[ +0.000000] R10: 0000000000000007 R11: 0000000000000202 R12: 0000000000000000
[ +0.000001] R13: 000055cc0ac9f190 R14: 000055cc0ae69940 R15: 000055cc0ae0ef20
[ +0.000001] ---[ end trace e94c5d94d8c46cbf ]---


1. https://git.archlinux.org/linux.git/commit/?h=v5.10.4-arch2&id=00f09a6a8193b46c83ae1c8ff6623db011f90099
Comment by loqs (loqs) - Sunday, 03 January 2021, 20:33 GMT
If you comment out sound/pci/hda/patch_hdmi.c line 2136:
/* snd_BUG_ON(!per_cvt->assigned);*/

Does that produce other issues?
Comment by John (graysky) - Sunday, 03 January 2021, 21:06 GMT
@loqs - Only triggered it once but when the monitor awoke after commenting line 2136, no kernel oops. What other issue would one expect to look for?
Comment by loqs (loqs) - Sunday, 03 January 2021, 21:23 GMT
An oops from somewhere else in the audio code or loss of sound / sound corruption.

Edit
00f09a6a8193b46c83ae1c8ff6623db011f90099 has been picked up in the for-linus branch of sound [1]
If you had free time you could try the head of that branch, as it is newer than any tag.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound.git/commit/?h=for-linus&id=3d5c5fdcee0f9a94deb0472e594706018b00aa31
Comment by John (graysky) - Monday, 04 January 2021, 12:25 GMT
@loqs - So far, 5.10.4-arch2-1 built with line 2136 commented out seems to be behaving normally. I have not yet discovered any ill effect from it. I will keep testing it a while longer before I try out the for-linus branch of sound, which I am admittedly ignorant about. As the name implies, are the changes there for review by Linus? How is that different from the formal PR route that other bug fixes take?

BTW, I opened a ticket describing this which predates this FS: https://bugzilla.kernel.org/show_bug.cgi?id=210987
Comment by loqs (loqs) - Monday, 04 January 2021, 17:02 GMT
There was no sound PR for 5.11-rc2, there was one for rc1 [1].
I expect a tag will be added to the branch later in the week just before a PR is sent to Linus.
You could wait for 5.11-rc3, see if it includes a sound PR, and check whether the issue is still present with it.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=58cf05f597b03a8212d9ecf2c79ee046d3ee8ad9
Comment by John (graysky) - Monday, 04 January 2021, 17:38 GMT
I think I understand. I assume that as fixes are making their way into the 5.11 branch, they are triaged for backporting and ultimate inclusion into 5.10.x RCs. I will also keep an eye on the 5.10.x stable queue[1]. If I am running your suggested modification for a few point releases, I don't see that as a negative so long as there isn't any data loss or other bad things as a result of it.

1. https://git.kernel.org/pub/scm/linux/kernel/git/stable/stable-queue.git/tree/queue-5.10
Comment by Gerry Kessler (renegat) - Tuesday, 09 February 2021, 11:38 GMT
Is there any progress in solving this issue?
I have a lot of Intel NUC Gen. 4 'Haswell' machines (D34010WYK / D54250WYK) on which this problem appears irregularly on boot and on resume from sleep.

Linux arch30 5.10.14-arch1-1 #1 SMP PREEMPT Sun, 07 Feb 2021 22:42:17 +0000 x86_64 GNU/Linux

[ 23.828460] ------------[ cut here ]------------
[ 23.828472] WARNING: CPU: 1 PID: 807 at sound/pci/hda/patch_hdmi.c:2136 hdmi_pcm_close+0x1c8/0x1d0 [snd_hda_codec_hdmi]
[ 23.828473] Modules linked in: mousedev btusb btrtl btbcm btintel bluetooth ecdh_generic ecc iwlmvm mac80211 snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio snd_hda_codec_hdmi intel_rapl_msr snd_hda_intel intel_rapl_common snd_intel_dspcfg soundwire_intel soundwire_generic_allocation libarc4 8021q soundwire_cadence garp mrp stp nct6775 llc x86_pkg_temp_thermal snd_hda_codec hwmon_vid intel_powerclamp iwlwifi coretemp snd_hda_core kvm_intel snd_hwdep soundwire_bus kvm at24 mei_hdcp snd_soc_core wmi_bmof cfg80211 snd_compress ac97_bus irqbypass snd_pcm_dmaengine rapl intel_cstate snd_pcm intel_uncore i2c_i801 e1000e pcspkr i2c_smbus mei_me snd_timer snd mei lpc_ich rfkill soundcore wmi mac_hid sg crypto_user fuse bpf_preload ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 usbhid dm_crypt cbc encrypted_keys dm_mod trusted tpm rng_core crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel crypto_simd cryptd glue_helper xhci_pci xhci_pci_renesas i915
[ 23.828571] video i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops cec drm intel_agp intel_gtt agpgart
[ 23.828586] CPU: 1 PID: 807 Comm: pulseaudio Not tainted 5.10.14-arch1-1 #1
[ 23.828588] Hardware name: /D54250WYB, BIOS WYLPT10H.86A.0045.2017.0302.2108 03/02/2017
[ 23.828593] RIP: 0010:hdmi_pcm_close+0x1c8/0x1d0 [snd_hda_codec_hdmi]
[ 23.828596] Code: 4c 89 e7 b9 07 0f 00 00 e8 a5 7c c6 ff 0f b7 33 b9 07 07 00 00 31 d2 83 e0 bf 4c 89 e7 41 89 c0 e8 cd 7b c6 ff e9 5a ff ff ff <0f> 0b e9 ec fe ff ff 90 0f 1f 44 00 00 41 56 41 89 d6 ba 98 06 00
[ 23.828598] RSP: 0018:ffffa32bc09e3e18 EFLAGS: 00010246
[ 23.828601] RAX: ffff96ec0328ca00 RBX: 0000000000000002 RCX: 0000000000000000
[ 23.828603] RDX: 0000000000000000 RSI: 0000000000000002 RDI: 0000000000000003
[ 23.828605] RBP: ffff96ec03821000 R08: ffff96ec0328ca00 R09: 0000000000000028
[ 23.828606] R10: ffff96ec004719c0 R11: 0000000000000000 R12: ffff96ec01066800
[ 23.828608] R13: ffff96ec01d459d8 R14: ffff96ec0328c308 R15: ffff96ec01d45800
[ 23.828611] FS: 00007f80029a2800(0000) GS:ffff96ed17a80000(0000) knlGS:0000000000000000
[ 23.828613] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 23.828614] CR2: 000055e85bb99b68 CR3: 000000011f318003 CR4: 00000000001706e0
[ 23.828616] Call Trace:
[ 23.828633] azx_pcm_close+0x75/0xf0 [snd_hda_codec]
[ 23.828642] snd_pcm_release_substream.part.0+0x40/0xd0 [snd_pcm]
[ 23.828649] snd_pcm_release+0x4e/0xb0 [snd_pcm]
[ 23.828654] __fput+0x8e/0x230
[ 23.828658] task_work_run+0x5c/0x90
[ 23.828662] exit_to_user_mode_prepare+0xf7/0x120
[ 23.828668] syscall_exit_to_user_mode+0x28/0x160
[ 23.828671] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 23.828674] RIP: 0033:0x7f800334eddb
[ 23.828677] Code: 8b 15 99 70 0c 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb 89 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 0b 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 65 70 0c 00 f7 d8 64 89 01 48
[ 23.828678] RSP: 002b:00007fff9e4d9298 EFLAGS: 00000202 ORIG_RAX: 000000000000000b
[ 23.828681] RAX: 0000000000000000 RBX: 000055e85bb97538 RCX: 00007f800334eddb
[ 23.828683] RDX: 0000000000000010 RSI: 0000000000001000 RDI: 00007f7ffe630000
[ 23.828684] RBP: 000055e85bb97300 R08: 000055e85bb97110 R09: 00007fff9e4d92a0
[ 23.828686] R10: 000055e85ba1f740 R11: 0000000000000202 R12: 0000000000000000
[ 23.828688] R13: 000055e85bb963e0 R14: 000055e85bb929d0 R15: 000055e85babbcb0
[ 23.828691] ---[ end trace 5aa71e19d21b809e ]---
Comment by John (graysky) - Tuesday, 09 February 2021, 12:19 GMT
Still hitting me on 5.10.14. You can use the work-around by loqs above if you build your own kernel to remove the spam to dmesg.
Comment by Vladimir (_v_l) - Wednesday, 10 February 2021, 00:21 GMT
@renegat and @graysky, it might be a different problem because I didn't have any problem with recent kernels but I see several issues (https://gitlab.freedesktop.org/drm/intel/-/issues) related to X hang.
Comment by loqs (loqs) - Wednesday, 10 February 2021, 18:12 GMT
@graysky is the issue still present in 5.11-rc7? Is there an upstream bug report?
Comment by John (graysky) - Wednesday, 10 February 2021, 18:15 GMT
@loqs - I did report it upstream; however, it was closed as fixed. On my system, it is not fixed. I requested that they reopen it yesterday, link: https://gitlab.freedesktop.org/drm/intel/-/issues/2883

I have not had time to test 5.11-rc7 yet but I will and report back.
Comment by John (graysky) - Wednesday, 10 February 2021, 19:58 GMT
@loqs - I can confirm that the bug is still present under 5.11-rc7.

1. Monitor goes to sleep.
2. User wakes it up and there is an accompanying kernel oops.
Example under 5.11-rc7
[Feb10 14:56] ------------[ cut here ]------------
[ +0.000003] WARNING: CPU: 5 PID: 848 at sound/pci/hda/patch_hdmi.c:2133 hdmi_pcm_close+0x1c8/0x1d0 [snd_hda_codec_hdmi]
[ +0.000007] Modules linked in: f2fs overlay dm_crypt cbc encrypted_keys trusted tpm rng_core joydev mousedev hid_microsoft ff_memless usbhid intel_rapl_msr intel_rapl_common ip6t_REJECT nf_reject_ipv6 iwlmvm snd_hda_codec_realtek snd_hda_codec_generic intel_smartconnect ledtrig_audio snd_hda_codec_hdmi mac80211 x86_pkg_temp_thermal intel_powerclamp snd_hda_intel snd_intel_dspcfg soundwire_intel soundwire_generic_allocation soundwire_cadence snd_hda_codec kvm_intel snd_hda_core iTCO_wdt snd_hwdep soundwire_bus dm_mod intel_pmc_bxt at24 mei_hdcp libarc4 xt_hl iTCO_vendor_support kvm mxm_wmi irqbypass crct10dif_pclmul ip6t_rt crc32_pclmul iwlwifi snd_soc_core ghash_clmulni_intel aesni_intel snd_compress crypto_simd ac97_bus cryptd glue_helper snd_pcm_dmaengine rapl cfg80211 intel_cstate i2c_i801 snd_pcm intel_uncore i2c_smbus e1000e lpc_ich snd_timer pcspkr ipt_REJECT nf_reject_ipv4 mei_me snd xt_multiport mei rfkill soundcore xt_comment wmi mac_hid acpi_pad xt_limit xt_addrtype xt_tcpudp
[ +0.000037] xt_conntrack ip6table_filter ip6_tables nf_conntrack_netbios_ns nf_conntrack_broadcast nf_nat_ftp nf_nat nf_conntrack_ftp nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c iptable_filter xt_iprange xt_mark xt_NFQUEUE nct6775 hwmon_vid fuse coretemp ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 crc32c_intel xhci_pci xhci_pci_renesas i915 video intel_gtt i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops cec drm agpgart
[ +0.000020] CPU: 5 PID: 848 Comm: pulseaudio Not tainted 5.11.0-rc7-1-mainline #1
[ +0.000001] Hardware name: MSI MS-7888/Z97 MPOWER MAX AC (MS-7888), BIOS V1.11 02/16/2016
[ +0.000001] RIP: 0010:hdmi_pcm_close+0x1c8/0x1d0 [snd_hda_codec_hdmi]
[ +0.000003] Code: 4c 89 e7 b9 07 0f 00 00 e8 a5 4c 0b 00 0f b7 33 b9 07 07 00 00 31 d2 83 e0 bf 4c 89 e7 41 89 c0 e8 cd 4b 0b 00 e9 5a ff ff ff <0f> 0b e9 ec fe ff ff 90 0f 1f 44 00 00 41 56 41 89 d6 ba 98 06 00
[ +0.000001] RSP: 0018:ffffa96e00e3be18 EFLAGS: 00010246
[ +0.000002] RAX: ffff9036cc73c700 RBX: 0000000000000002 RCX: 0000000000000000
[ +0.000001] RDX: 0000000000000000 RSI: 0000000000000002 RDI: 0000000000000003
[ +0.000000] RBP: ffff9036cc972e00 R08: ffff9036cc73c700 R09: 0000000000000028
[ +0.000001] R10: ffff9036c640b600 R11: 0000000000000000 R12: ffff9036c418c000
[ +0.000001] R13: ffff9036d05409d8 R14: ffff9036cc73cc08 R15: ffff9036d0540800
[ +0.000001] FS: 00007f417b201800(0000) GS:ffff903dc0340000(0000) knlGS:0000000000000000
[ +0.000001] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ +0.000001] CR2: 00007fd2aec8f910 CR3: 0000000147264005 CR4: 00000000001706e0
[ +0.000001] Call Trace:
[ +0.000003] azx_pcm_close+0x75/0xf0 [snd_hda_codec]
[ +0.000009] snd_pcm_release_substream.part.0+0x40/0xd0 [snd_pcm]
[ +0.000007] snd_pcm_release+0x4e/0xb0 [snd_pcm]
[ +0.000004] __fput+0x85/0x230
[ +0.000003] task_work_run+0x5c/0x90
[ +0.000002] exit_to_user_mode_prepare+0x158/0x160
[ +0.000003] syscall_exit_to_user_mode+0x23/0x50
[ +0.000003] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ +0.000003] RIP: 0033:0x7f417bbadddb
[ +0.000001] Code: 8b 15 99 70 0c 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb 89 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 0b 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 65 70 0c 00 f7 d8 64 89 01 48
[ +0.000001] RSP: 002b:00007ffe9769f648 EFLAGS: 00000202 ORIG_RAX: 000000000000000b
[ +0.000001] RAX: 0000000000000000 RBX: 0000555836921b78 RCX: 00007f417bbadddb
[ +0.000001] RDX: 0000000000000010 RSI: 0000000000001000 RDI: 00007f4176ebb000
[ +0.000001] RBP: 0000555836a27e60 R08: 0000000000000000 R09: 0000000000000010
[ +0.000000] R10: 0000000000000007 R11: 0000000000000202 R12: 0000000000000000
[ +0.000001] R13: 000055583691c5c0 R14: 0000555836945920 R15: 0000555836a8e940
[ +0.000002] ---[ end trace f1a27ac4ba6624b9 ]---
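
To compare traces like the ones in this thread without reading them by eye, the following Python sketch extracts the source file, line number, and function from each WARNING line. The function name find_warnings is hypothetical, and the regular expression is modelled only on the WARNING lines quoted here, so treat it as a starting point rather than a complete parser.

```python
import re

# Pattern for lines such as:
# "WARNING: CPU: 5 PID: 848 at sound/pci/hda/patch_hdmi.c:2133 hdmi_pcm_close+0x1c8/0x1d0 [...]"
_WARN_RE = re.compile(
    r"WARNING: CPU: (?P<cpu>\d+) PID: (?P<pid>\d+) at "
    r"(?P<file>\S+):(?P<line>\d+) (?P<func>[\w.]+)\+"
)


def find_warnings(log_text):
    """Return (source_file, line_number, function, pid) for each WARNING line."""
    return [
        (m.group("file"), int(m.group("line")), m.group("func"), int(m.group("pid")))
        for m in _WARN_RE.finditer(log_text)
    ]


if __name__ == "__main__":
    sample = (
        "[ +0.000003] WARNING: CPU: 5 PID: 848 at sound/pci/hda/patch_hdmi.c:2133 "
        "hdmi_pcm_close+0x1c8/0x1d0 [snd_hda_codec_hdmi]"
    )
    print(find_warnings(sample))
```

Run against the traces in this thread, it reports hdmi_pcm_close in sound/pci/hda/patch_hdmi.c each time, at line 2136 on the 5.10.x kernels and line 2133 on 5.11-rc7.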
Comment by loqs (loqs) - Wednesday, 10 February 2021, 20:17 GMT
If you do not hear from the i915 devs in a few weeks you could try the author Stephen Warren and committer Takashi Iwai of the patch that introduced the BUG_ON [1].

[1] https://github.com/torvalds/linux/commit/384a48d71520ca569a63f1e61e51a538bedb16df
Comment by John (graysky) - Wednesday, 10 February 2021, 20:46 GMT
Thanks, loqs. I just emailed them both pointing them to the two upstream bug reports and mentioning the commit you called out.

For reference:
https://bugzilla.kernel.org/show_bug.cgi?id=210987
https://gitlab.freedesktop.org/drm/intel/-/issues/2883
Comment by S Shaikh (sshaikh) - Thursday, 25 March 2021, 19:44 GMT
Experiencing this issue with 5.11.9 via Arch. For me the fix was to add two kernel parameters as per:

https://bbs.archlinux.org/viewtopic.php?pid=1958671#p1958671

I still cannot suspend, but DPMS seems to work okay now.

Please let me know if I can contribute in any way.
Comment by loqs (loqs) - Thursday, 25 March 2021, 20:07 GMT
The snd_BUG_ON was removed by 6f294e3e3ebb200fbbf15a08df14e00f11f0567d in 5.11.3 [1] and by 0a7efa3fd7a106b04f90266934423b98fa37f9e6 in 5.10.20 [2] so the issue should have been fixed in 5.11.9.
Did you get the same warning and backtrace?
Edit:
The original issue was fixed by [3] in 5.10.4-arch2 and backported by upstream as adee1c5126ef0aa7951e0ba101b73a3cd6732c09 in 5.10.6 [4].

[1] https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=6f294e3e3ebb200fbbf15a08df14e00f11f0567d
[2] https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=0a7efa3fd7a106b04f90266934423b98fa37f9e6
[3] https://git.archlinux.org/linux.git/commit/?id=00f09a6a8193b46c83ae1c8ff6623db011f90099
[4] https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit?id=adee1c5126ef0aa7951e0ba101b73a3cd6732c09
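
As a rough aid, the version cut-offs above can be turned into a small Python check against the output of uname -r. This is only a sketch: the name has_bug_on_removal is invented for illustration, the table contains just the two stable series named above, and the handling of other series is an assumption (newer series are assumed to inherit the change from mainline, older ones are reported as not covered).

```python
import re

# First stable release in each series that dropped the snd_BUG_ON,
# taken from the commits cited above (5.10.20 and 5.11.3).
FIRST_FIXED = {(5, 10): 20, (5, 11): 3}


def has_bug_on_removal(release):
    """Guess from a `uname -r` style string whether snd_BUG_ON was removed."""
    m = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if m is None:
        raise ValueError("unrecognised kernel release: %r" % release)
    series = (int(m.group(1)), int(m.group(2)))
    patch = int(m.group(3) or 0)
    if series in FIRST_FIXED:
        return patch >= FIRST_FIXED[series]
    # Assumption: series newer than 5.11 carry the change via mainline;
    # series older than 5.10 are not covered by the commits cited here.
    return series > (5, 11)


if __name__ == "__main__":
    for rel in ("5.10.14-arch1-1", "5.11.0-rc7-1-mainline", "5.11.9-arch1-1"):
        print(rel, has_bug_on_removal(rel))
```

A True result only means the WARNING from hdmi_pcm_close should no longer appear; it says nothing about the separate Xorg hang.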
Comment by S Shaikh (sshaikh) - Friday, 26 March 2021, 11:58 GMT
I've described the issue at:

https://bbs.archlinux.org/viewtopic.php?pid=1963711

To clarify I had only updated to 5.11.9 after I posted the logs, but I do update every few days so am confident I was running > 5.10.4 when the logs were generated.

I do think that it still has something to do with snd_hda_codec_hdmi, as the kernel parameters that help me also suppress the "failed to power up codec" startup message, which seems to be correlated with the wake-up issue.

Happy to go through any standard steps to reproduce the issue if it helps this bug - my own experiments were hardly scientific.
Comment by S Shaikh (sshaikh) - Friday, 26 March 2021, 19:45 GMT
I repeated my tests with the latest kernel, and can confirm that I no longer see any stack traces or errors in the logs apart from

[ 3.921538] snd_hda_codec_hdmi hdaudioC1D2: Monitor plugged-in, Failed to power up codec ret=[-13]
[ 12.900494] i915 0000:00:02.0: [drm] *ERROR* Failed to read TMDS config: -6

However the actual wakeup behaviour still remains and I wake up to a black screen and locked keyboard from both DPMS and systemctl suspend, and am required to hard shut down with 4 seconds on the power button to restart.
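
For reference, the negative numbers at the end of the two log lines above are kernel error codes (negated errno values). A small Python sketch like the following can translate them; decode_kernel_ret is a made-up helper name, the pattern only covers the two formats seen in those lines, and the mapping relies on Python's errno table on a Linux machine matching the kernel's numbering.

```python
import errno
import os
import re

# Matches a trailing negative return code such as "ret=[-13]" or "config: -6".
_RET_RE = re.compile(r"(?:ret=\[|:\s)(-\d+)\]?\s*$")


def decode_kernel_ret(line):
    """Return (code, symbolic_name, description) for a trailing -errno, or None."""
    m = _RET_RE.search(line)
    if m is None:
        return None
    code = -int(m.group(1))
    name = errno.errorcode.get(code, "UNKNOWN")
    return code, name, os.strerror(code)


if __name__ == "__main__":
    lines = [
        "snd_hda_codec_hdmi hdaudioC1D2: Monitor plugged-in, Failed to power up codec ret=[-13]",
        "i915 0000:00:02.0: [drm] *ERROR* Failed to read TMDS config: -6",
    ]
    for line in lines:
        print(decode_kernel_ret(line))
```

On Linux, 13 maps to EACCES and 6 to ENXIO, which may be useful when searching upstream reports for the same messages.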

I've also noticed a similar crash although it's not clear to me when it happens - perhaps shutdown? Although I don't notice any delay in doing that:

Mar 26 17:34:42 desktop systemd-logind[299]: Failed to abandon session scope, ignoring: Connection timed out
Mar 26 17:34:42 desktop systemd-logind[299]: Session 3 logged out. Waiting for processes to exit.
Mar 26 17:34:55 desktop kernel: INFO: task Xorg:499 blocked for more than 122 seconds.
Mar 26 17:34:55 desktop kernel: Not tainted 5.11.9-arch1-1 #1
Mar 26 17:34:55 desktop kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Mar 26 17:34:55 desktop kernel: task:Xorg state:D stack: 0 pid: 499 ppid: 498 flags:0x00004004
Mar 26 17:34:55 desktop kernel: Call Trace:
Mar 26 17:34:55 desktop kernel: __schedule+0x2dd/0x8b0
Mar 26 17:34:55 desktop kernel: schedule+0x5b/0xc0
Mar 26 17:34:55 desktop kernel: schedule_preempt_disabled+0x11/0x20
Mar 26 17:34:55 desktop kernel: __ww_mutex_lock.constprop.0+0x42c/0x810
Mar 26 17:34:55 desktop kernel: drm_modeset_lock+0x31/0xb0 [drm]
Mar 26 17:34:55 desktop kernel: glk_force_audio_cdclk+0xa3/0x1a0 [i915]
Mar 26 17:34:55 desktop kernel: i915_audio_component_get_power+0xe0/0x100 [i915]
Mar 26 17:34:55 desktop kernel: snd_hdac_display_power+0xd7/0x150 [snd_hda_core]
Mar 26 17:34:55 desktop kernel: __azx_runtime_resume+0x1d/0xe0 [snd_hda_intel]
Mar 26 17:34:55 desktop kernel: azx_runtime_resume+0x3d/0xa0 [snd_hda_intel]
Mar 26 17:34:55 desktop kernel: pci_pm_runtime_resume+0xaa/0xc0
Mar 26 17:34:55 desktop kernel: ? pci_pm_freeze_noirq+0x100/0x100
Mar 26 17:34:55 desktop kernel: ? pci_pm_freeze_noirq+0x100/0x100
Mar 26 17:34:55 desktop kernel: __rpm_callback+0xc5/0x170
Mar 26 17:34:55 desktop kernel: ? pci_pm_freeze_noirq+0x100/0x100
Mar 26 17:34:55 desktop kernel: rpm_callback+0x1f/0x70
Mar 26 17:34:55 desktop kernel: rpm_resume+0x5c4/0x810
Mar 26 17:34:55 desktop kernel: rpm_resume+0x308/0x810
Mar 26 17:34:55 desktop kernel: __pm_runtime_resume+0x3b/0x60
Mar 26 17:34:55 desktop kernel: sync_eld_via_acomp+0xe7/0x350 [snd_hda_codec_hdmi]
Mar 26 17:34:55 desktop kernel: check_presence_and_report+0x57/0x80 [snd_hda_codec_hdmi]
Mar 26 17:34:55 desktop kernel: intel_audio_codec_enable+0x12a/0x1a0 [i915]
Mar 26 17:34:55 desktop kernel: intel_enable_ddi+0x450/0x580 [i915]
Mar 26 17:34:55 desktop kernel: ? fwtable_write32+0x4c/0x240 [i915]
Mar 26 17:34:55 desktop kernel: intel_encoders_enable+0x80/0xa0 [i915]
Mar 26 17:34:55 desktop kernel: hsw_crtc_enable+0x1f2/0x760 [i915]
Mar 26 17:34:55 desktop kernel: intel_enable_crtc+0x59/0x70 [i915]
Mar 26 17:34:55 desktop kernel: skl_commit_modeset_enables+0x271/0x550 [i915]
Mar 26 17:34:55 desktop kernel: intel_atomic_commit_tail+0x397/0x13a0 [i915]
Mar 26 17:34:55 desktop kernel: ? flush_workqueue_prep_pwqs+0x117/0x130
Mar 26 17:34:55 desktop kernel: ? flush_workqueue+0x19d/0x3f0
Mar 26 17:34:55 desktop kernel: intel_atomic_commit+0x333/0x3b0 [i915]
Mar 26 17:34:55 desktop kernel: drm_atomic_connector_commit_dpms+0xda/0x100 [drm]
Mar 26 17:34:55 desktop kernel: drm_mode_obj_set_property_ioctl+0x196/0x3d0 [drm]
Mar 26 17:34:55 desktop kernel: ? __schedule+0x2e5/0x8b0
Mar 26 17:34:55 desktop kernel: ? drm_connector_set_obj_prop+0x90/0x90 [drm]
Mar 26 17:34:55 desktop kernel: drm_connector_property_set_ioctl+0x39/0x60 [drm]
Mar 26 17:34:55 desktop kernel: drm_ioctl_kernel+0xb2/0x100 [drm]
Mar 26 17:34:55 desktop kernel: drm_ioctl+0x215/0x390 [drm]
Mar 26 17:34:55 desktop kernel: ? drm_connector_set_obj_prop+0x90/0x90 [drm]
Mar 26 17:34:55 desktop kernel: __x64_sys_ioctl+0x83/0xb0
Mar 26 17:34:55 desktop kernel: do_syscall_64+0x33/0x40
Mar 26 17:34:55 desktop kernel: entry_SYSCALL_64_after_hwframe+0x44/0xa9
Mar 26 17:34:55 desktop kernel: RIP: 0033:0x7f58d244ce6b
Mar 26 17:34:55 desktop kernel: RSP: 002b:00007ffe6d374138 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
Mar 26 17:34:55 desktop kernel: RAX: ffffffffffffffda RBX: 00007ffe6d374170 RCX: 00007f58d244ce6b
Mar 26 17:34:55 desktop kernel: RDX: 00007ffe6d374170 RSI: 00000000c01064ab RDI: 000000000000000a
Mar 26 17:34:55 desktop kernel: RBP: 00000000c01064ab R08: 0000561bea03de88 R09: 0000000000000000
Mar 26 17:34:55 desktop kernel: R10: 0000000000000000 R11: 0000000000000246 R12: 0000561bea249ae0
Mar 26 17:34:55 desktop kernel: R13: 000000000000000a R14: 0000000000000000 R15: 0000000000000000
Comment by Jan Alexander Steffens (heftig) - Wednesday, 21 April 2021, 11:32 GMT
Still an issue?
Comment by John (graysky) - Wednesday, 21 April 2021, 11:54 GMT
@heftig - Not for me; fixed for a while now upstream.
Comment by S Shaikh (sshaikh) - Wednesday, 05 May 2021, 08:26 GMT
  • Field changed: Percent Complete (100% → 0%)
I still see the issue with 5.11.15-arch1-2, and still use various kernel parameters to avoid the crash.

Happy to verify it's the same problem before reopening, however.
