FS#59091 - [linux] X intermittent input event handling (mostly mouse) inside virtualbox

Attached to Project: Arch Linux
Opened by Radu Pralea (rpralea) - Wednesday, 20 June 2018, 20:26 GMT
Last edited by Doug Newgard (Scimmia) - Wednesday, 04 July 2018, 01:30 GMT
Task Type Bug Report
Category Kernel
Status Closed
Assigned To Tobias Powalowski (tpowa)
Jan Alexander Steffens (heftig)
Architecture x86_64
Severity High
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 17
Private No

Details

Description:
After upgrading Arch guests (VirtualBox, on both Windows and macOS hosts) to xorg-server 1.20.0-8, input event handling becomes intermittent (mostly mouse clicks, but also pointer position and icon updates, and sometimes the keyboard).

Several reports of this issue have already been posted in a dedicated mailing list thread:
[arch-general] Upgrade to Linux 4.17.2 & xorg-server 1.20.0-8 breaks left-mouse in remote Linux desktops

This doesn't seem to come from libinput (which received an update to version 1.11.1-1 at the same time): even when the windowing system does not react, all mouse events are displayed by libinput debug-events.
Logging X events with xev gives results that match the windowing system's erratic behavior: many (most) of the libinput mouse events, and sometimes keyboard events, generate no events in X.
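
The comparison above (events visible to libinput but missing in X) can be put into numbers with a small script. This is only a sketch: it assumes both tools were run over the same period with their output saved to text, and the line shapes it matches are assumed from typical libinput debug-events and xev output, not taken from an affected guest. The function names are made up for illustration.

```python
import re
from typing import Tuple

# Assumed line shapes (illustrative only):
#   libinput debug-events: " event5  POINTER_BUTTON  +1.52s  BTN_LEFT (272) pressed, seat count: 1"
#   xev:                   "ButtonPress event, serial 37, synthetic NO, window 0x2600001,"
LIBINPUT_PRESS = re.compile(r"POINTER_BUTTON\b.*\bpressed\b")
XEV_PRESS = re.compile(r"^ButtonPress event\b")


def count_presses(libinput_log: str, xev_log: str) -> Tuple[int, int]:
    """Return (presses reported by libinput, ButtonPress events logged by xev)."""
    seen = sum(1 for line in libinput_log.splitlines() if LIBINPUT_PRESS.search(line))
    delivered = sum(1 for line in xev_log.splitlines() if XEV_PRESS.match(line))
    return seen, delivered


def dropped_presses(libinput_log: str, xev_log: str) -> int:
    """Presses that libinput reported but that never showed up as X ButtonPress events."""
    seen, delivered = count_presses(libinput_log, xev_log)
    return max(seen - delivered, 0)
```

Note that xev only logs events delivered to its own window, so the clicks have to be made over that window for the two counts to be comparable.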

Additional info:
* package version(s): xorg-server 1.20.0-8, libinput 1.11.1-1, linux 4.17.2

Steps to reproduce:
Upgrade to the latest xorg, libinput and kernel, start an X session in an Arch guest and play around with some windows (starting programs, resizing and moving/dragging windows). Soon the desktop becomes inoperable by mouse. A couple of times the keyboard stopped working too (the system was still running fine, it wasn't frozen).

Some of the impacted users downgraded xorg-server (and/or xorg-server-common?) or the kernel (to 4.16) and the corresponding VirtualBox guest modules to get back to a usable desktop.

Closed by  Doug Newgard (Scimmia)
Wednesday, 04 July 2018, 01:30 GMT
Reason for closing:  Fixed
Additional comments about closing:  linux 4.17.4-1
Comment by Frederic Bezies (fredbezies) - Thursday, 21 June 2018, 13:04 GMT
This is not limited to Arch Linux-based systems. I am seeing the same kind of bug with openSUSE Tumbleweed, and I opened a bug on the VirtualBox tracker.

https://www.virtualbox.org/ticket/17827

My only "workaround" for now: blacklisting the vboxguest kernel module.
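
A blacklist entry only stops a module from being autoloaded, so it is worth checking that the module is really gone after a reboot. The sketch below parses the text of /proc/modules, where the first field of each line is the module name. The helper names are hypothetical, and the module list is just the two in-tree guest modules named in this thread.

```python
# The two in-tree VirtualBox guest modules mentioned in this thread.
VBOX_GUEST_MODULES = ("vboxguest", "vboxvideo")


def loaded_modules(proc_modules_text: str) -> set:
    """Parse /proc/modules content; the first field of each line is the module name."""
    return {line.split()[0] for line in proc_modules_text.splitlines() if line.strip()}


def loaded_vbox_modules(proc_modules_text: str) -> list:
    """Return the VirtualBox guest modules that are still loaded, in a fixed order."""
    loaded = loaded_modules(proc_modules_text)
    return [name for name in VBOX_GUEST_MODULES if name in loaded]
```

Inside the guest it could be called as loaded_vbox_modules(open("/proc/modules").read()); an empty list means neither module is loaded.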
Comment by David C. Rankin (drankinatty) - Thursday, 21 June 2018, 22:33 GMT
I can confirm that after updating to xorg 1.20.0-8 (and Linux 4.17.2), VirtualBox guests (Arch Linux guests on an Arch Linux host) ignore left mouse button input when accessed via rdesktop (it may apply to local guests as well). This is with the latest VirtualBox 5.2.12. Multiple users have confirmed this and similar behavior (see: June 18, 2018 thread - [arch-general] Upgrade to Linux 4.17.2 & xorg-server 1.20.0-8 breaks left-mouse in remote Linux).

I run guests headless so that they are available over my LAN, starting each with VBoxManage startvm "VM_name" --type headless. After the xorg update there is no way to use the left mouse button to interact with Linux guests (Win7 guests continue to work fine). Keyboard input seems to work OK (I was able to use up-arrow and down-arrow to exit fluxbox). Console input is fine.

If a window is opened in the remote guest (e.g. xterm), there is no way to left-click and drag to move the window, and clicking on the toolbar controls does nothing.

Prior to xorg 1.20.0-8 (and Linux 4.17.2), this setup had been flawless for years.
Comment by Frederic Bezies (fredbezies) - Friday, 22 June 2018, 12:12 GMT
Looks like a fix has been committed: https://www.virtualbox.org/changeset/72641/vbox

Or it may only be a fix to get the guest additions to build against Linux 4.17.

But when I tried to build VirtualBox from svn (using the virtualbox-svn AUR package), it failed to build because of Qt 5.11.x...

"/home/fred/virtualbox-svn/src/VirtualBox/src/VBox/Frontends/VirtualBox/src/settings/global/UIGlobalSettingsProxy.cpp: In member function 'void UIGlobalSettingsProxy::prepare()':
/home/fred/virtualbox-svn/src/VirtualBox/src/VBox/Frontends/VirtualBox/src/settings/global/UIGlobalSettingsProxy.cpp:215:59: error: invalid use of incomplete type 'class QButtonGroup'
QButtonGroup *pButtonGroup = new QButtonGroup(this);"

Ouch!

Comment by Doug Newgard (Scimmia) - Friday, 22 June 2018, 16:31 GMT
I really doubt that's the fix, since we aren't even using that module. vboxguest and vboxvideo are in the kernel tree now. The question is: is this an issue with those modules or with the userspace tools?
Comment by David C. Rankin (drankinatty) - Friday, 22 June 2018, 18:55 GMT
Updates to libinput-1.11.1-2 and xorg-server-1.20.0-9 do not fix the problem. After upgrading both Arch host and Arch guest, the problem remains. There is no left-mouse input on virtualbox when accessing Arch guests over rdesktop. Moreover, it is as if the cursor position doesn't even register anymore. Moving the mouse over the menu brought up by the right-mouse click does not even highlight menu entries anymore, and focus-follows-mouse no longer works.
Comment by David C. Rankin (drankinatty) - Friday, 22 June 2018, 19:19 GMT
For coordination purposes, there is an open bug on the Oracle VirtualBox bug tracker as well:

https://www.virtualbox.org/ticket/17827

It looks like a Linux 4.17 issue, as similar problems are reported on openSUSE Tumbleweed with 4.17 and an older xorg, and on Ubuntu as well, but it is unclear whether there is enough information to rule out the kernel, libinput or xorg.
Comment by Daniel (8472) - Tuesday, 26 June 2018, 15:05 GMT
@David C. Rankin (drankinatty) - I'm running on LTS (currently 4.14.*) and experiencing the same problems.
Comment by Layne (xente) - Wednesday, 27 June 2018, 14:56 GMT
I installed linux 4.16.13-2 from ALA and everything seems to be working fine. Looks like it is 4.17.
Comment by Jeff Hodd (jghodd) - Friday, 29 June 2018, 05:34 GMT
I discovered an interesting facet of this issue. The mouse problem goes away if you send an ACPI shutdown to your virtual machine, wait for it to log out (by timeout), and log back in again. Everything works from that point on until the next shutdown/reboot. Just an FYI in case it helps find the solution. I'll do more testing of this tomorrow.
Comment by Jeff Hodd (jghodd) - Saturday, 30 June 2018, 21:38 GMT
A couple of observations:

1. On first login, you get one left-click which still leaves mouse-over events active.
2. If you right-click, you lose left-click and mouse-over events.
3. If you use your left-click to open the KDE menu, you can continue forever to use the left-click to open the KDE menu, but right-click and mouse-over are dead.
4. You must use your "bonus" left-click on first login before logging out and back in to get a fully functional desktop. If you boot up into your desktop and immediately use the ACPI Shutdown to force a logout, the left-click issue persists into the next desktop session.

I have log files from the initial login and the subsequent functional logout/login. There is no apparent difference between the Xorg.0.log files and I can see nothing obvious in the other logs.

Also, linux 4.17.3 does not fix this issue.

Another thought: if you have to use one left-click to force a reset on the next session, it seems to me that that first left-click is breaking something that subsequently needs to be reinitialized or reloaded. If you simply log out and then log in without using your one left-click, the issue persists across desktop sessions until that one left-click is used.
Comment by David C. Rankin (drankinatty) - Saturday, 30 June 2018, 23:45 GMT
Confirmed - linux-4.17.3-1 does not fix the issue.

Updated both Arch host and Arch guest to linux-4.17.3-1. Left mouse and focus-follows-mouse are ignored. I can confirm the left mouse button is active before a right-click, but after leaving the window where the left click was used BOTH left-mouse and right-mouse are inoperative. That is bewildering.

The way I tested was to start Fluxbox and I have the panel set to Auto-Hide. The only way to raise the panel is mouse-over.

When fluxbox starts, if my first action is to move the mouse over the panel, the panel will raise and I can click with the left mouse button on the triangle to cycle Desktop 1->4.

However, as soon as the mouse leaves the panel, all left-mouse AND right-mouse input and focus-follows-mouse are lost. I cannot raise the panel a second time, and now, with the right mouse button lost, I cannot access the fluxbox menu.

Thankfully, I can Alt+F2 to bring up fbrun and enter 'sudo systemctl poweroff' to shut the VM down gracefully.

Comment by David C. Rankin (drankinatty) - Sunday, 01 July 2018, 00:26 GMT
I can also confirm as Jeff Hodd posted that once you start your desktop, if you are able to exit the desktop and then restart the desktop, mouse functionality returns and appears to work normally. Which is equally bewildering.

For this test, I started fluxbox, it opened, and while I had not used the mouse I could mouse over and raise the hidden panel, but as soon as I right-clicked to bring up the menu, the left mouse functionality died. I exited fluxbox by bringing up the menu with the rt-mouse and then using the arrow keys.

I then restarted fluxbox (I use startx) and on the second starting, the mouse continued to work. I have used the left-mouse, middle-mouse, right-mouse and mouse-wheel (to switch desktops 1->4) and the mouse remained active and working normally.

I can't even venture a guess as to how or why this could occur. It seems too consistent between people reporting to be Undefined Behavior, but on the other hand you could see how some kernel pointer left with an indeterminate value could cause the mouse event/input state to get screwed up as well. This is going to be one that the smart people have to figure out (which will probably end up being the same people responsible for the behavior as well...)
Comment by jbding (goldenhawking) - Sunday, 01 July 2018, 04:45 GMT
You can reactivate event handling by clicking the right mouse button. It seems that some windows do not get the focus at startup and therefore cannot respond to events. When the right button is clicked, the context menu pops up and changes the focus again, which causes the overlap relationship between the windows to return to normal.
Comment by jbding (goldenhawking) - Sunday, 01 July 2018, 05:04 GMT
The process tree may have a great effect on the phenomenon of failure. Applications started from Docky are worse than the same application started from the Application menu. Using ALT + TAB, you can reactivate some events, but it is useless for terminal windows.

I don't know the mechanism of the focus relationship in the Xorg window system. It seems that there is a "hidden" top-most window that steals the mouse focus at startup. Or there is a problem with the geometry position calculation. Is it possible that the GUI coordinate offset of the parent process is not set correctly at the beginning?

My understanding of the Linux system is too elementary; I sincerely hope that you Arch Linux experts find the cause!
Comment by ilya (leniviy) - Sunday, 01 July 2018, 19:47 GMT
Is this reported upstream? Too bad Ubuntu is still on 4.15.
Comment by loqs (loqs) - Sunday, 01 July 2018, 21:10 GMT
@leniviy there is a link to an upstream report at virtualbox in the first comment. Or do you mean upstream X or linux?
No one with the issue has bisected 4.16-4.17 to locate the commit causing it. Having that commit would gain the issue more attention should it be reported to upstream linux.
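
For anyone unfamiliar with the idea, bisection is a binary search over the ordered commits between a known-good and a known-bad kernel. The sketch below shows only the search logic, with made-up commit names and a caller-supplied test. In practice git bisect does the bookkeeping, and each test means building that kernel, booting the guest and trying the mouse. It assumes the behavior flips exactly once between the two endpoints, which an intermittent bug can make hard to judge.

```python
from typing import Callable, Sequence


def first_bad(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the first commit for which is_bad() is true.

    Assumes commits are ordered oldest to newest, commits[0] is known good,
    commits[-1] is known bad, and the behavior flips exactly once in between.
    """
    good, bad = 0, len(commits) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(commits[mid]):
            bad = mid   # the regression is at mid or earlier
        else:
            good = mid  # the regression is after mid
    return commits[bad]
```

The number of kernels to build and test grows with the logarithm of the number of commits in the range, which is what makes bisecting a whole release cycle feasible at all.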
Comment by Jeff Hodd (jghodd) - Monday, 02 July 2018, 14:37 GMT
The only change to the vbox kernel code for 4.17.x is to vboxvideo/vbox_ttm.c - ttm is a generic gpu memory manager. None of the changes appear to be suspicious, however. Modifications were made to match API changes in drm/ttm. There is nothing in this code that would seem to apply to a pointer/mouse specifically, and I would expect that if any part of this code failed, it would bring vboxvideo down with it. And we're not seeing that. So, I find it far less likely that this is a linux kernel bug than it is a virtualbox bug.

I doubt very much this is an X issue because if it were, we'd be seeing similar failures on non-virtual platforms as well. And we're not seeing that either.

Perhaps we need to apply some pressure to the open virtualbox bug (the URL for which is listed above) and concentrate on what's happening in the host logs.
Comment by Frederic Bezies (fredbezies) - Monday, 02 July 2018, 16:25 GMT
If you want to see this bug corrected, you have to ask them to modify the GUI code so that it builds against Qt 5.11. See this other bug I opened: https://www.virtualbox.org/ticket/17835
Comment by loqs (loqs) - Monday, 02 July 2018, 16:48 GMT
@jghodd most reporters state the issue only applies to 4.17, with at least one report that a downgrade to 4.16 is not affected by the issue [1]
@fredbezies are you certain that there is a fix in svn for the issue, if you could get it to build with Qt 5.11? Have you tried using ALA to switch back to Qt 5.10, building from svn, and confirming that it includes a fix?

[1] https://bugs.archlinux.org/task/59091#comment170810
Comment by Frederic Bezies (fredbezies) - Monday, 02 July 2018, 17:13 GMT
As far as I know, there is still no fix for this issue. But at least making the VirtualBox GUI buildable with Qt 5.11 will be a good thing. I tested "nightlies" from the VirtualBox site and nothing changed :(
Comment by Radu Pralea (rpralea) - Monday, 02 July 2018, 17:42 GMT
I confirm, killing the Xorg process (causing it to restart) somehow solves the problem: mouse becomes usable.
Comment by Jeff Hodd (jghodd) - Monday, 02 July 2018, 18:39 GMT
@loqs I compared all vboxvideo and vboxguest kernel source files for 4.16.13 and 4.17.3 and found no changes to the code except for vbox_ttm.c - and those changes were made to conform to kernel ttm api changes, so there were no apparent changes in logic. The ttm module is a generic gpu memory manager, and if something was broken there, we'd be seeing 1) a more random pattern of things breaking and 2) different things breaking on different systems and/or distros. And we're not seeing that.
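
A comparison like the one described above can be repeated with a short script. This is a sketch under assumptions: the two directory arguments are whatever driver directories were extracted from each kernel tree, and changed_files is a made-up helper name. It relies on filecmp.dircmp, which compares files shallowly (type, size and mtime first, contents only when those differ).

```python
import filecmp
from pathlib import Path
from typing import List


def changed_files(old_dir: str, new_dir: str) -> List[str]:
    """Recursively list entries that differ, or exist on only one side.

    Returned paths are relative to the two compared roots.
    """
    changed: List[str] = []

    def walk(cmp: filecmp.dircmp, prefix: Path) -> None:
        # diff_files: present on both sides but different
        # left_only / right_only: present on one side only
        for name in cmp.diff_files + cmp.left_only + cmp.right_only:
            changed.append(str(prefix / name))
        for name, sub in cmp.subdirs.items():
            walk(sub, prefix / name)

    walk(filecmp.dircmp(old_dir, new_dir), Path("."))
    return sorted(changed)
```

The output only says which files changed, not whether a change is relevant, so each reported file still has to be read with a normal diff.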

I concede that there may indeed be some other issue of compatibility between virtualbox 5.2.12 and linux 4.17.x, but since the kernel's vboxguest codebase hasn't changed in any notable way between 4.16.13 and 4.17.3, it's doubtful that the issue is in the vboxguest/vboxvideo kernel extensions. The last Arch release of virtualbox was June 6, and that would have been after at least several days of testing by the Arch team and who knows how long it was tested by the Oracle team before it even made it to Arch testing. Linux 4.17.0 was released June 3. So what version of linux 4.17 was virtualbox 5.2.12 actually tested against? Best case, it had to have been tested/built against a release candidate. So by extension then, what version of linux-headers was virtualbox 5.2.12 built against?

That said, I do hope that the issue comes down to something like the Qt version, although Arch's version of virtualbox 5.2.12 already builds against Qt 5.11.

I'm currently doing a full build of virtualbox 5.2.12 in a fully updated Arch environment. I'll report back on whether that makes any difference.
Comment by loqs (loqs) - Monday, 02 July 2018, 18:57 GMT
@jghodd do you not believe that someone affected performing a bisection between 4.16 and 4.17 would rule the kernel in or out as the cause?
Comment by Jeff Hodd (jghodd) - Monday, 02 July 2018, 19:18 GMT
@loqs I believe someone *should* have performed a bisection between 4.16 and 4.17, but we have yet to hear anything at all from our Arch kernel gurus assigned to this bug. @tpowa and @heftig are the Arch folks who maintain the kernel. If they had already ruled the kernel in or out, shouldn't we have seen some feedback from them?

@tpowa?
@heftig?

Can we get some feedback from someone please?
Comment by Jan Alexander Steffens (heftig) - Monday, 02 July 2018, 19:36 GMT
I don't have time to investigate most bugs, sorry.
Comment by Jeff Hodd (jghodd) - Monday, 02 July 2018, 23:29 GMT
@heftig Thanks, Jan. I kinda figured that. I had noted a while back that you had taken over some of Tobias' work, so I'm guessing you're both busy. Also, I know from past experience if either of you were actively investigating this bug there'd be contributions from you in this comment thread.

@loqs This bug has mostly been running under the radar, so what you think should be happening might not be happening. It doesn't seem to have piqued much interest over at Oracle's Virtualbox site either - they seem to be under the impression that nobody's running the latest versions of anything. Sooner or later, when all distros are running linux 4.17, look out fan... That's when it'll get fixed. Either that or we have to pester the Oracle folks so they'll take notice. Meanwhile, if you want to try a bisection of the kernels, please do. Be aware that about a quarter million lines of code were removed for 4.17, so bisecting 4.16 and 4.17 might return a whole lot more than you want to address.

I did a rebuild of virtualbox and my distro iso, and the problem is still there, so the issue is not related to kernel header file changes. For now, I'm at a loss. Ideas welcome...
Comment by David Thiede (davet) - Tuesday, 03 July 2018, 03:29 GMT
I just upgraded to the 5.2.14 r123301 (Qt5.6.2) release on my Windows 10 system and the problem persists. I didn't have much hope, seeing that others had tested from current source. Still, the only sure way around the loss of mouse control is to blacklist the 2 related vbox modules.
Comment by Jeff Hodd (jghodd) - Tuesday, 03 July 2018, 05:08 GMT
I concur with @davet - 5.2.14 (Qt-5.11.1) still has the same issue. Not sure about the module blacklisting though - tried that and (of course) got no graphics at all. Am going to try an iso build without guest-utils next...
Comment by Christian Hesse (eworm) - Tuesday, 03 July 2018, 22:53 GMT
Fixed in linux 4.17.4-1.
Comment by Jeff Hodd (jghodd) - Wednesday, 04 July 2018, 00:29 GMT
Yup. Confirming this is fixed in linux 4.17.4.