FS#45369 - [linux] corruption during MST hotplug / MST dp power

Attached to Project: Arch Linux
Opened by Jeff Mickey (codemac) - Thursday, 18 June 2015, 01:58 GMT
Last edited by Doug Newgard (Scimmia) - Friday, 04 September 2015, 00:05 GMT
Task Type: Bug Report
Category: Kernel
Status: Closed
Assigned To: Tobias Powalowski (tpowa), Thomas Bächler (brain0)
Architecture: x86_64
Severity: High
Priority: Normal
Reported Version:
Due in Version: Undecided
Due Date: Undecided
Percent Complete: 100%
Votes: 0
Private: No

Details

When I use my Lenovo T440s with a Lenovo dock, the external monitor works, but when I undock the laptop, everything freezes. This has been happening for a long while[0], but until now the journal had never managed to flush the panic trace to disk before the freeze. This time it did, and the trace is pasted below.

[0]: https://bbs.archlinux.org/viewtopic.php?pid=1537750#p1537750

The reproduction steps:

- t440s plugged into dock with a single DisplayPort (full sized) attached to a monitor

- Boot t440s

- run startx

- run xterm and type 'xrandr --auto && xrandr --output DP2-1 --off'

- Kernel panics, computer freezes and is unresponsive until hard reset.

It seems to be related to DisplayPort MST hotplugging leaving a pointer invalid (notably the struct drm_dp_mst_branch *mstb argument to drm_dp_check_and_send_link_address, which looks like it is NULL when the crash happens). That function seems to be called only from various init functions, which I imagine run when MST devices appear or disappear.
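
One way to sanity-check the NULL-pointer theory is to decode the faulting instruction from the "Code:" line of the oops in the log below, where the byte in angle brackets marks the instruction that faulted. The following is an illustrative Python sketch, not part of the original report. It handles only the single instruction encoding seen here, and the helper name decode_cmp_imm8 is made up for this example. It also assumes mstb is the second parameter of the function, in which case the System V x86-64 calling convention passes it in RSI.

```python
# Values copied from the oops in this report.
CR2 = 0x4C                                  # faulting address
RSI = 0x0                                   # RSI in the register dump
FAULTING_BYTES = bytes.fromhex("807e4c00")  # bytes starting at the <80> marker

# 64-bit register names selected by ModRM.rm when no REX prefix is present.
REGS = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]


def decode_cmp_imm8(insn):
    """Decode only 'cmpb $imm8, disp8(%reg)', i.e. opcode 80 /7 ib with mod=01."""
    opcode, modrm, disp8, imm8 = insn
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    # rm == 4 would mean a SIB byte follows, which this sketch does not handle.
    if opcode != 0x80 or reg != 7 or mod != 1 or rm == 4:
        raise ValueError("not a 'cmpb imm8, disp8(reg)' encoding")
    # The 8-bit displacement is signed.
    disp = disp8 - 256 if disp8 >= 0x80 else disp8
    return REGS[rm], disp, imm8


base, disp, imm = decode_cmp_imm8(FAULTING_BYTES)
print(f"cmpb ${imm:#x}, {disp:#x}(%{base})")
print("base + displacement == CR2:", RSI + disp == CR2)
```

The instruction reads one byte at offset 0x4c from RSI. With RSI equal to zero, that address is exactly the CR2 value 0x4c, which is consistent with a member of *mstb being read through a NULL pointer. Which member sits at that offset depends on the struct layout in this kernel build, so the sketch does not try to name it.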

I apologize in advance if this bug is misplaced.

Additional info:

* package version(s)

linux 4.0.5-1

* config and/or log files etc.

; journalctl -b -1 -k -n 50
-- Logs begin at Wed 2014-02-26 12:05:36 PST, end at Wed 2015-06-17 18:48:51 PDT. --
Jun 17 18:31:40 nevada kernel: sound hdaudioC0D0: HDMI: Unknown ELD version 9
Jun 17 18:31:45 nevada kernel: [drm] GMBUS [i915 gmbus dpb] timed out, falling back to bit banging on pin 5
Jun 17 18:32:00 nevada kernel: cfg80211: Calling CRDA for country: US
Jun 17 18:32:00 nevada kernel: cfg80211: Calling CRDA for country: US
Jun 17 18:32:02 nevada kernel: cfg80211: Calling CRDA for country: US
Jun 17 18:32:02 nevada kernel: cfg80211: Calling CRDA for country: US
Jun 17 18:32:02 nevada kernel: cfg80211: Calling CRDA for country: US
Jun 17 18:32:02 nevada kernel: cfg80211: Calling CRDA for country: US
Jun 17 18:32:27 nevada kernel: fuse init (API version 7.23)
Jun 17 18:33:08 nevada kernel: BUG: unable to handle kernel NULL pointer dereference at 000000000000004c
Jun 17 18:33:08 nevada kernel: IP: [<ffffffffa05a7133>] drm_dp_check_and_send_link_address+0x13/0xa0 [drm_kms_helper]
Jun 17 18:33:08 nevada kernel: PGD 0
Jun 17 18:33:08 nevada kernel: Oops: 0000 [#1] PREEMPT SMP
Jun 17 18:33:08 nevada kernel: Modules linked in: fuse ctr ccm xt_addrtype xt_conntrack ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 iptable_filter ip_tables x_tables nf_nat nf_conntrack overlay tun nls_iso8859_1 nls_cp437 vfat fat j
Jun 17 18:33:08 nevada kernel: drm_kms_helper cfg80211 snd_hda_controller memstick thinkpad_acpi drm snd_hda_codec wmi thermal snd_hwdep snd_pcm intel_gtt nvram snd_timer led_class i2c_algo_bit e1000e tpm_tis battery rfkill button snd ac tpm mei_me i2c_core video ptp mei pps_core s
Jun 17 18:33:08 nevada kernel: CPU: 1 PID: 548 Comm: kworker/1:3 Tainted: G O 4.0.5-1-ARCH #1
Jun 17 18:33:08 nevada kernel: Hardware name: LENOVO 20AQCTO1WW/20AQCTO1WW, BIOS GJET67WW (2.17 ) 12/10/2013
Jun 17 18:33:08 nevada kernel: Workqueue: events_long drm_dp_mst_link_probe_work [drm_kms_helper]
Jun 17 18:33:08 nevada kernel: task: ffff880309a09440 ti: ffff880309af8000 task.ti: ffff880309af8000
Jun 17 18:33:08 nevada kernel: RIP: 0010:[<ffffffffa05a7133>] [<ffffffffa05a7133>] drm_dp_check_and_send_link_address+0x13/0xa0 [drm_kms_helper]
Jun 17 18:33:08 nevada kernel: RSP: 0018:ffff880309afbdc8 EFLAGS: 00010286
Jun 17 18:33:08 nevada kernel: RAX: ffff88031e258205 RBX: ffff88030e775300 RCX: ffff88031e253658
Jun 17 18:33:08 nevada kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88003776b600
Jun 17 18:33:08 nevada kernel: RBP: ffff880309afbde8 R08: ffff88031e253640 R09: 0000000000000001
Jun 17 18:33:08 nevada kernel: R10: 0000000000000002 R11: 00000000ffff146c R12: ffff88031e253640
Jun 17 18:33:08 nevada kernel: R13: ffff88003776b600 R14: 0000000000000000 R15: ffff88003776b9d8
Jun 17 18:33:08 nevada kernel: FS: 0000000000000000(0000) GS:ffff88031e240000(0000) knlGS:0000000000000000
Jun 17 18:33:08 nevada kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jun 17 18:33:08 nevada kernel: CR2: 000000000000004c CR3: 000000000180b000 CR4: 00000000001407e0
Jun 17 18:33:08 nevada kernel: Stack:
Jun 17 18:33:08 nevada kernel: ffff88030e775300 ffff88031e253640 ffff88031e258200 0000000000000000
Jun 17 18:33:08 nevada kernel: ffff880309afbdf8 ffffffffa05a71dc ffff880309afbe48 ffffffff8108da3b
Jun 17 18:33:08 nevada kernel: ffff88031e253640 0000000000000000 ffff88031e253658 ffff88030e775330
Jun 17 18:33:08 nevada kernel: Call Trace:
Jun 17 18:33:08 nevada kernel: [<ffffffffa05a71dc>] drm_dp_mst_link_probe_work+0x1c/0x20 [drm_kms_helper]
Jun 17 18:33:08 nevada kernel: [<ffffffff8108da3b>] process_one_work+0x14b/0x470
Jun 17 18:33:08 nevada kernel: [<ffffffff8108e188>] worker_thread+0x48/0x4b0
Jun 17 18:33:08 nevada kernel: [<ffffffff8108e140>] ? init_pwq.part.7+0x10/0x10
Jun 17 18:33:08 nevada kernel: [<ffffffff8108e140>] ? init_pwq.part.7+0x10/0x10
Jun 17 18:33:08 nevada kernel: [<ffffffff81093418>] kthread+0xd8/0xf0
Jun 17 18:33:08 nevada kernel: [<ffffffff81093340>] ? kthread_worker_fn+0x170/0x170
Jun 17 18:33:08 nevada kernel: [<ffffffff8157a4d8>] ret_from_fork+0x58/0x90
Jun 17 18:33:08 nevada kernel: [<ffffffff81093340>] ? kthread_worker_fn+0x170/0x170
Jun 17 18:33:08 nevada kernel: Code: 94 f7 ff e9 53 fe ff ff b8 f4 ff ff ff e9 72 fe ff ff 66 0f 1f 44 00 00 0f 1f 44 00 00 55 48 89 e5 41 56 41 55 41 54 53 49 89 fd <80> 7e 4c 00 49 89 f6 74 6c 49 8b 46 18 4d 8d 66 18 49 39 c4 48
Jun 17 18:33:08 nevada kernel: RIP [<ffffffffa05a7133>] drm_dp_check_and_send_link_address+0x13/0xa0 [drm_kms_helper]
Jun 17 18:33:08 nevada kernel: RSP <ffff880309afbdc8>
Jun 17 18:33:08 nevada kernel: CR2: 000000000000004c
Jun 17 18:33:08 nevada kernel: ---[ end trace f8467360a3bcc93e ]---
Jun 17 18:33:08 nevada kernel: BUG: unable to handle kernel paging request at ffffffffffffffd8
Jun 17 18:33:08 nevada kernel: IP: [<ffffffff81093920>] kthread_data+0x10/0x20
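
To read the first oops at a glance, the frames can be pulled out of the trace and the ones marked with "?" dropped, since the kernel uses that marker for return addresses it found on the stack but could not confirm as part of the call chain. The following is a minimal Python sketch, not part of the original report. The excerpt is copied from the log above with the journal prefix removed, and the function name reliable_frames is made up for this example.

```python
import re

# Excerpt of the journal lines above, without the "Jun 17 ... kernel:" prefix.
TRACE = """\
IP: [<ffffffffa05a7133>] drm_dp_check_and_send_link_address+0x13/0xa0 [drm_kms_helper]
Call Trace:
[<ffffffffa05a71dc>] drm_dp_mst_link_probe_work+0x1c/0x20 [drm_kms_helper]
[<ffffffff8108da3b>] process_one_work+0x14b/0x470
[<ffffffff8108e188>] worker_thread+0x48/0x4b0
[<ffffffff8108e140>] ? init_pwq.part.7+0x10/0x10
[<ffffffff81093418>] kthread+0xd8/0xf0
[<ffffffff81093340>] ? kthread_worker_fn+0x170/0x170
[<ffffffff8157a4d8>] ret_from_fork+0x58/0x90
"""

# Matches "[<address>] [? ]symbol+0xOFFSET/0xSIZE".
FRAME = re.compile(r"\[<[0-9a-f]+>\] (\? )?([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+")


def reliable_frames(text):
    """Return symbol names, innermost first, skipping frames marked with '?'."""
    frames = []
    for line in text.splitlines():
        match = FRAME.search(line)
        if match and not match.group(1):
            frames.append(match.group(2))
    return frames


for name in reliable_frames(TRACE):
    print(name)
```

Read from the innermost frame outwards, the chain shows the crash happening in a kernel worker thread that runs drm_dp_mst_link_probe_work, which matches the "Workqueue: events_long" line in the oops.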



Closed by  Doug Newgard (Scimmia)
Friday, 04 September 2015, 00:05 GMT
Reason for closing:  Fixed
Additional comments about closing:  linux 4.1.6-1
Comment by Jeff Mickey (codemac) - Thursday, 18 June 2015, 02:44 GMT
Ahh, and repro'd with drm.debug=6 for more output. See attached file.
Comment by Jeff Mickey (codemac) - Thursday, 18 June 2015, 18:26 GMT
The closest existing bug report I found is on freedesktop.org (fdo)[0]. It's still marked as NEW, and I'm not sure the main MST support author (Dave Airlie) has the info or background data they need.

[0]: https://bugs.freedesktop.org/show_bug.cgi?id=89366
Comment by Jeff Mickey (codemac) - Thursday, 03 September 2015, 23:48 GMT
This specific issue has been fixed in 4.1.6 as far as I can tell. I can no longer reproduce.
