FS#37432 - linux 3.11.6 and Virtualbox 4.3.0
Attached to Project:
Community Packages
Opened by Jamp (jamp) - Monday, 21 October 2013, 14:52 GMT
Last edited by Sébastien Luttringer (seblu) - Saturday, 16 November 2013, 12:02 GMT
Details
Description:
After upgrading to linux 3.11.6 and VirtualBox 4.3.0, my virtualized Windows XP machine no longer works. Sometimes I get BSODs; sometimes it starts but is really slow and uses 100% CPU. This is very bad because there are people who *work* with these systems.
Additional info:
* package version(s)
* config and/or log files etc.
Steps to reproduce:
This task depends upon
This task blocks these from closing
FS#37850 - [VirtualBox] unable to boot any Windows Vista or 7 VMs.
FS#37872 - [virtualbox] 4.3.2 100% CPU
Closed by Sébastien Luttringer (seblu)
Saturday, 16 November 2013, 12:02 GMT
Reason for closing: Upstream
upd:
I have BSOD now. Something is really wrong.
The system (hardware) has usually been extremely stable for the past year (industrial i7 motherboard), so I would rule out
new hardware issues.
History:
- System upgraded using pacman -Syu
- system rebooted
- started xfce4
- virtualbox started (4.3.latest) running with one Windows7 + latest virtualbox host modules installed in it + rebooted Win7.
- worked for some hours...
- music was playing in VLC
- was reading a PDF
- sound stopped, mouse froze.
- login via ssh possible
- dmesg shows kernel panic
- shutdown -r now works.
- the kernel panic shows:
Oct 21 18:36:57 octo kernel: [16801.121381] PGD 4270f2067 PUD 42746d067 PMD 0
Oct 21 18:36:57 octo kernel: [16801.122107] Oops: 0000 [#1] PREEMPT SMP
Oct 21 18:36:57 octo kernel: [16801.122839] Modules linked in: fuse usb_storage pl2303 usbserial joydev hid_logitech_dj usbhid hid nfsd rpcsec_gss_krb5 auth_rpcgss nfs_acl oid_registry nfsv4 hwmon_vid snd_hda_codec_realtek raid1 md_mod coretemp kvm_intel kvm crc32c_intel ppdev evdev gpio_ich iTCO_wdt iTCO_vendor_support microcode nouveau psmouse serio_raw pcspkr mxm_wmi i7core_edac wmi edac_core i2c_i801 video ttm drm_kms_helper drm i2c_algo_bit i2c_core snd_hda_intel snd_hda_codec parport_pc parport snd_hwdep snd_pcm e1000e snd_page_alloc thermal ptp snd_timer acpi_cpufreq mei_me snd fan mei mperf pps_core button soundcore lpc_ich shpchp processor pci_stub vboxpci(O) vboxnetflt(O) vboxnetadp(O) vboxdrv(O) nfs lockd sunrpc fscache ext4 crc16 mbcache jbd2 sd_mod firewire_ohci uhci_hcd ehci_pci ata_generic ahci libahci firewire_core pata_acpi ehci_hcd libata crc_itu_t usbcore usb_common scsi_mod
Oct 21 18:36:57 octo kernel: [16801.126268] CPU: 1 PID: 838 Comm: X Tainted: G O 3.11.6-1-ARCH #1
Oct 21 18:36:57 octo kernel: [16801.127156] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./To be filled by O.E.M., BIOS 080015 05/24/2010
Oct 21 18:36:57 octo kernel: [16801.128081] task: ffff8804271eb2a0 ti: ffff88042862a000 task.ti: ffff88042862a000
Oct 21 18:36:57 octo kernel: [16801.129014] RIP: 0010:[<ffffffffa08a1d52>] [<ffffffffa08a1d52>] nouveau_fence_wait_uevent.isra.1+0x22/0x440 [nouveau]
Oct 21 18:36:57 octo kernel: [16801.129963] RSP: 0018:ffff88042862bc48 EFLAGS: 00010296
Oct 21 18:36:57 octo kernel: [16801.130915] RAX: 0000000000000000 RBX: ffff8802216359e8 RCX: 0000000000000001
Oct 21 18:36:57 octo kernel: [16801.131865] RDX: 0000000000000001 RSI: ffff8802216359f0 RDI: ffff8802216359e8
Oct 21 18:36:57 octo kernel: [16801.132808] RBP: ffff88042862bcc8 R08: 0000000000000346 R09: 000000000000e200
Oct 21 18:36:57 octo kernel: [16801.133741] R10: ffffffffa0909f80 R11: ffff88042862be10 R12: 0000000000000001
Oct 21 18:36:57 octo kernel: [16801.134676] R13: 0000000000000000 R14: ffff8804272e18a0 R15: ffff8802216359f0
Oct 21 18:36:57 octo kernel: [16801.135614] FS: 00007f6ba6f92880(0000) GS:ffff88043fc40000(0000) knlGS:0000000000000000
Oct 21 18:36:57 octo kernel: [16801.136545] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Oct 21 18:36:57 octo kernel: [16801.137485] CR2: 0000000000000008 CR3: 00000004278e7000 CR4: 00000000000027e0
Oct 21 18:36:57 octo kernel: [16801.138432] Stack:
Oct 21 18:36:57 octo kernel: [16801.139355] ffff88042862be10 ffffffffa0909f80 000000000000e200 0000000000000346
Oct 21 18:36:57 octo kernel: [16801.140297] 0000000000400000 0000000000000001 0000000000000001 ffff8802216359f0
Oct 21 18:36:57 octo kernel: [16801.141246] ffff8802216359e8 ffffffffffffffae ffffffffa08a1d39 ffff8802216359c0
Oct 21 18:36:57 octo kernel: [16801.142192] Call Trace:
Oct 21 18:36:57 octo kernel: [16801.143110] [<ffffffffa08a1d39>] ? nouveau_fence_wait_uevent.isra.1+0x9/0x440 [nouveau]
Oct 21 18:36:57 octo kernel: [16801.144035] [<ffffffffa08a21f6>] nouveau_fence_wait+0x86/0x1a0 [nouveau]
Oct 21 18:36:57 octo kernel: [16801.144989] [<ffffffffa08a3de5>] nouveau_bo_fence_wait+0x15/0x20 [nouveau]
Oct 21 18:36:57 octo kernel: [16801.145937] [<ffffffffa07c7911>] ttm_bo_wait+0x91/0x190 [ttm]
Oct 21 18:36:57 octo kernel: [16801.146911] [<ffffffffa08a90d9>] nouveau_gem_ioctl_cpu_prep+0x59/0xc0 [nouveau]
Oct 21 18:36:57 octo kernel: [16801.147877] [<ffffffffa07651a2>] drm_ioctl+0x532/0x660 [drm]
Oct 21 18:36:57 octo kernel: [16801.148856] [<ffffffff811b57e5>] ? d_free+0x55/0x60
Oct 21 18:36:57 octo kernel: [16801.149833] [<ffffffff8108bc74>] ? lg_global_unlock+0x44/0x90
Oct 21 18:36:57 octo kernel: [16801.150816] [<ffffffff811bdec0>] ? mntput_no_expire+0x100/0x150
Oct 21 18:36:57 octo kernel: [16801.151815] [<ffffffff811b1c05>] do_vfs_ioctl+0x2e5/0x4d0
Oct 21 18:36:57 octo kernel: [16801.152812] [<ffffffff811a15ee>] ? ____fput+0xe/0x10
Oct 21 18:36:57 octo kernel: [16801.153806] [<ffffffff81080744>] ? task_work_run+0xa4/0xe0
Oct 21 18:36:57 octo kernel: [16801.154814] [<ffffffff811b1e71>] SyS_ioctl+0x81/0xa0
Oct 21 18:36:57 octo kernel: [16801.155816] [<ffffffff814ea5dd>] system_call_fastpath+0x1a/0x1f
Oct 21 18:36:57 octo kernel: [16801.156834] Code: c3 0f 1f 84 00 00 00 00 00 66 66 66 66 90 55 48 89 e5 41 57 49 89 f7 41 56 41 55 41 54 41 89 d4 53 48 89 fb 48 83 ec 58 48 8b 07 <48> 8b 48 08 48 8b 91 f8 00 00 00 4c 8b b1 c8 07 00 00 48 8b 42
Oct 21 18:36:57 octo kernel: [16801.159045] RSP <ffff88042862bc48>
Oct 21 18:36:57 octo kernel: [16801.160077] CR2: 0000000000000008
Oct 21 18:36:57 octo kernel: [16801.168404] ---[ end trace b8438271a457747e ]---
Oct 21 18:36:57 octo kernel: [16801.400646] VirtualBox[5690]: segfault at 3f1527f0 ip 00007f19aae31e0d sp 00007fffe7c33c70 error 4 in libc-2.18.so[7f19aadfa000+1a2000]
Regards,
Clemens
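When reading a trace like the one above, the "RIP:" line names the function and module where the fault happened. Here that is nouveau_fence_wait_uevent in the nouveau module, in process X, while the vbox modules only appear in the "Modules linked in" list. The following is a minimal sketch for pulling that information out of a pasted log. The function name faulting_location and the regular expression are illustrative assumptions, written against the 3.x-era message format shown in this report; other kernel versions print the line differently.

```python
import re

# Matches "symbol+0xOFF/0xLEN [module]" at the end of an oops "RIP:" line.
# Assumption: the line format is the one shown in this report (3.11 kernel).
RIP_RE = re.compile(
    r"RIP:.*\]\s+(?P<symbol>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+\s+\[(?P<module>\w+)\]"
)

def faulting_location(log_text):
    """Return (symbol, module) from the first RIP line, or None.

    Faults in built-in kernel code have no trailing [module] and are
    not matched by this sketch.
    """
    for line in log_text.splitlines():
        match = RIP_RE.search(line)
        if match:
            return match.group("symbol"), match.group("module")
    return None

sample = (
    "Oct 21 18:36:57 octo kernel: [16801.129014] RIP: 0010:[<ffffffffa08a1d52>] "
    "[<ffffffffa08a1d52>] nouveau_fence_wait_uevent.isra.1+0x22/0x440 [nouveau]"
)
print(faulting_location(sample))
```

Knowing the faulting module does not by itself say which component caused the corruption, but it is worth including in an upstream report.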
I for one, managed to freeze my system after upgrading to virtualbox 4.3 and downgrading resolved it.
I tried to downgrade both the kernel and VirtualBox, but this didn't solve it. Maybe I made some mistakes since I was in a hurry, and eventually I chose to upgrade everything again. (By the way, I also upgraded and downgraded the guest additions.)
This problem hurt me badly. I almost lost my workday. Maybe Arch is too bleeding edge to be used at work.
I have VirtualBox 4.2.18, Arch Linux (under a Windows XP host), kernel 3.11.6 and VirtualBox guest additions 4.3.0, and all is OK.
You have to check that all components of VirtualBox 4.3.0 are really uninstalled, then reinstall VirtualBox 4.2.18.
Maybe the bug occurs only with Windows guests under VirtualBox 4.3.0.
IMHO VirtualBox has problems with both, because the process is always eating 100%+ of the CPU, but the Win7 virtualization works while the XP one does not.
The XP virtualization BSODs or, if it starts, is painfully slow. Unfortunately I need to use some software that works with XP only :(
One workaround is to use "wine" if you have win32 software:
https://www.archlinux.org/packages/community/i686/wine/
Maybe it depends on the underlying emulated hardware..
The facts are that "before" it worked and "after" it does not. Nothing has changed
meanwhile besides the effects of the usual "pacman -Suy".
This means that something has been broken somewhere.
I went back to a previous kernel and virtualbox version and it works again.
Wine is not a serious alternative for my needs. Maybe qemu could be.
I reported the crash to Oracle/virtualbox.org.
In the meantime, a downgrade to virtualbox-4.2.18 saves my day.
Regards,
Clemens
First one right after pacman -Suy, rmmod of the virtualbox modules + modprobe, with the old kernel still running, so it's probably solely VB's fault and not dependent on the kernel version.
BTW, guest OS (win7) works fine for me, before the host deadlocks.
Is there an upstream bug report in virtualbox?
Edit: downgrading virtualbox to 4.2.18 seems to work for me, whole day without problems.
https://db.tt/Ceb9jSle
I'm now running the LTS kernel to see if that helps.
Has this problem been reported upstream? Is there a URL for the bug report so we can follow its progress?
Edit: Downgrading to 4.2.18 still works as a workaround. :)
Regards.
Has anyone tried virtualbox-bin to "definitely" rule out the Arch Linux package as the root cause?
With the version from the official repos the crash occurred randomly, but the VM never ran for more than a few minutes.
I tested with and without the extension pack (to rule that out).
Argh... I cannot find my upstream bug-report anymore in oracle's crazy system
and then I ran out of time.
I moved back from 4.3.0-1 to 4.2.18 and the crashes disappeared.
Then I went from 4.2.18 to 4.3.0-2 and the crashes appeared again, but they are
completely different every time. Here are some (more or less random) examples from the kernel.log
of a system that is otherwise calm and very stable when VirtualBox is not running:
("very stable" means something like 0-1 crashes in the last year with >10h/day intensive load.)
Oct 27 13:29:32 octo kernel: [11534.063802] warning: `VirtualBox' uses 32-bit capabilities (legacy support in use)
Oct 27 19:22:28 octo kernel: [32726.749340] nouveau E[ PFIFO][0000:01:00.0] DMA_PUSHER - ch 2 [X[932]] get 0x00200176bc put 0x00200176d4 ib_get 0x0000012c ib_put 0x00000146 state 0x80000064 (err: INVALID_CMD) push 0x00406040
Oct 27 19:22:28 octo kernel: [32726.751727] nouveau E[ PFIFO][0000:01:00.0] CACHE_ERROR - ch 2 [X[932]] subc 0 mthd 0x0060 data 0xffffffff
Oct 27 22:02:25 octo kernel: [42331.077348] nouveau E[ PFIFO][0000:01:00.0] CACHE_ERROR - ch 7 [glhanoi[13228]] subc 3 mthd 0x1b00 data 0x00000000
Oct 27 23:56:04 octo kernel: [49155.975882] nfsd: last server has exited, flushing export cache
[ 1697.591376] perf samples too long (2509 > 2500), lowering kernel.perf_event_max_sample_rate to 50100
Oct 28 10:24:47 octo kernel: [ 2911.744468] warning: `VirtualBox' uses 32-bit capabilities (legacy support in use)
Oct 28 11:56:24 octo kernel: [ 8413.181889] perf samples too long (5085 > 4990), lowering kernel.perf_event_max_sample_rate to 25200
Oct 29 11:56:43 octo kernel: [ 4138.485318] nouveau E[ PFIFO][0000:01:00.0] CACHE_ERROR - ch 4 [boing[4729]] subc 0 mthd 0x0060 data 0xbeef0201
Oct 29 11:57:50 octo kernel: [ 4206.086899] perf samples too long (2504 > 2500), lowering kernel.perf_event_max_sample_rate to 50100
Oct 29 15:04:44 octo kernel: [15428.370834] perf samples too long (5129 > 4990), lowering kernel.perf_event_max_sample_rate to 25200
Oct 30 00:13:10 octo kernel: [48360.413503] gedit[25092]: segfault at 20 ip 00007faef72ffd28 sp 00007fff9eab3280 error 4 in libgtk-3.so.0.1000.2[7faef712f000+4fa000]
Oct 30 00:13:10 octo systemd-coredump[25103]: Process 25092 (gedit) dumped core.
Oct 30 10:43:51 octo kernel: [ 1855.149564] perf samples too long (2687 > 2500), lowering kernel.perf_event_max_sample_rate to 50100
Oct 30 12:38:16 octo kernel: [ 8725.236259] warning: `VirtualBox' uses 32-bit capabilities (legacy support in use)
Oct 30 22:04:15 octo kernel: [42711.417743] Uhhuh. NMI received for unknown reason 2c on CPU 3.
Oct 30 22:04:15 octo kernel: [42711.417747] Do you have a strange power saving mode enabled?
Oct 30 22:04:15 octo kernel: [42711.417748] Dazed and confused, but trying to continue
Oct 31 01:00:26 octo kernel: [ 625.690457] warning: `VirtualBox' uses 32-bit capabilities (legacy support in use)
Oct 31 01:12:16 octo kernel: [ 1336.315669] perf samples too long (2503 > 2500), lowering kernel.perf_event_max_sample_rate to 50100
Oct 31 01:39:20 octo kernel: [ 2961.507434] Uhhuh. NMI received for unknown reason 3c on CPU 2.
Oct 31 01:39:20 octo kernel: [ 2961.507441] Do you have a strange power saving mode enabled?
Oct 31 01:39:20 octo kernel: [ 2961.507445] Dazed and confused, but trying to continue
Oct 31 02:04:48 octo kernel: [ 4490.621068] Uhhuh. NMI received for unknown reason 3c on CPU 3.
Oct 31 02:04:48 octo kernel: [ 4490.621071] Do you have a strange power saving mode enabled?
Oct 31 02:04:48 octo kernel: [ 4490.621072] Dazed and confused, but trying to continue
Oct 31 03:51:59 octo kernel: [10926.983634] nfsd: last server has exited, flushing export cache
Oct 31 10:57:45 octo kernel: [ 3904.370925] perf samples too long (2508 > 2500), lowering kernel.perf_event_max_sample_rate to 50100
Oct 31 11:03:03 octo kernel: [ 4222.502617] md: md0: resync done.
Oct 31 11:03:03 octo kernel: [ 4222.625040] RAID1 conf printout:
Oct 31 11:03:03 octo kernel: [ 4222.625048] --- wd:2 rd:2
Oct 31 11:03:03 octo kernel: [ 4222.625054] disk 0, wo:0, o:1, dev:sdb1
Oct 31 11:03:03 octo kernel: [ 4222.625059] disk 1, wo:0, o:1, dev:sdc1
Oct 31 11:26:36 octo kernel: [ 5636.890600] warning: `VirtualBox' uses 32-bit capabilities (legacy support in use)
Oct 31 14:22:52 octo kernel: [16220.530818] perf samples too long (5041 > 4990), lowering kernel.perf_event_max_sample_rate to 25200
Oct 31 21:16:00 octo kernel: [41028.492697] nouveau E[ PFIFO][0000:01:00.0] CACHE_ERROR - ch 4 [noof[12133]] subc 0 mthd 0x0060 data 0xbeef0201
Oct 31 21:17:41 octo kernel: [41129.626619] nouveau E[ PFIFO][0000:01:00.0] CACHE_ERROR - ch 4 [noof[12133]] subc 0 mthd 0x0060 data 0xbeef0201
Nov 1 00:43:26 octo kernel: [53484.235738] nfsd: last server has exited, flushing export cache
Well, if 4.3.2 doesn't improve the situation, I'll have to get more serious about debugging.
My time is just very limited on this.
Regards,
Clemens
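Because the symptoms in the excerpt above differ from one incident to the next, it can help to count how often each kind of message shows up before and after a package change. This is a rough sketch only: the substrings are taken from the log lines quoted in this report, while the category labels and the function name tally are made up for illustration.

```python
from collections import Counter

# Substring -> label. The substrings come from the log excerpts in this
# report; the labels are illustrative names, not kernel terminology.
PATTERNS = {
    "nouveau E[": "nouveau error",
    "NMI received for unknown reason": "unknown NMI",
    "perf samples too long": "perf throttling",
    "segfault at": "userspace segfault",
    "Oops:": "kernel oops",
}

def tally(lines):
    """Count log lines per category; each line counts at most once."""
    counts = Counter()
    for line in lines:
        for needle, label in PATTERNS.items():
            if needle in line:
                counts[label] += 1
                break
    return counts

sample = [
    "Oct 30 22:04:15 octo kernel: [42711.417743] Uhhuh. NMI received for unknown reason 2c on CPU 3.",
    "Oct 29 11:56:43 octo kernel: [ 4138.485318] nouveau E[ PFIFO][0000:01:00.0] CACHE_ERROR - ch 4 [boing[4729]] subc 0 mthd 0x0060 data 0xbeef0201",
    "Oct 31 11:03:03 octo kernel: [ 4222.502617] md: md0: resync done.",
]
print(dict(tally(sample)))
```

Running the same tally over the log from a period on 4.2.18 and a period on 4.3.x would show whether any category only appears with the newer version.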
I want to know if the problem only occurs on 64-bit CPUs,
because I use Arch Linux on a laptop (Pentium 4 2.4 GHz, 32-bit CPU) and I don't notice any bugs with VirtualBox 4.3.2 (Windows XP guest).
One interesting test is to reinstall Windows XP under VirtualBox 4.3.2; maybe the VM files are corrupted for those who have problems.
Indeed, I'm using a laptop with a 64-bit CPU, an Intel i3 370M.
Sadly I can't reinstall the guest OS. It is not Windows XP, though; it is Windows 2003 Server.
Maybe I can try later with a fresh install of Windows XP.
Regards.
PS:
Maybe somebody can leave some tips on how to capture the text of the kernel panic (sorry, right now I don't know how to get it), so I can provide a little more info.
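One possible way to recover the kernel messages after a forced reboot, sketched here under the assumption that the host runs systemd with a persistent journal (the /var/log/journal directory exists), is to ask journalctl for the kernel log of the previous boot. The helper names below are hypothetical; only the journalctl options -k (kernel messages), -b (select a boot) and --no-pager are real. On a host without a persistent journal, a syslog-style kernel.log like the one quoted earlier in this report is the place to look instead.

```python
import subprocess

def previous_boot_kernel_log_cmd(boots_back=1):
    """Build a journalctl command for kernel messages from an earlier boot."""
    if boots_back < 1:
        raise ValueError("boots_back must be 1 or greater")
    # -k: kernel messages only; -b -N: the boot N boots before the current one.
    return ["journalctl", "-k", "-b", f"-{boots_back}", "--no-pager"]

def save_previous_boot_kernel_log(path, boots_back=1):
    """Run journalctl and write its output to `path` (needs a persistent journal)."""
    result = subprocess.run(
        previous_boot_kernel_log_cmd(boots_back),
        capture_output=True, text=True, check=True,
    )
    with open(path, "w") as out:
        out.write(result.stdout)

# Show the command that would be run; nothing is executed here.
print(" ".join(previous_boot_kernel_log_cmd()))
```

If the host is still reachable over ssh after the freeze, as in the first report above, running dmesg in that session before rebooting captures the same messages directly.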
Host: i7 860 2.8GHz CPU, arch-64bit 16GB RAM, nVidia NVS290 graphics card in an office workstation.
Frequency stepping down to 1.2GHz
Virtual Machine: Windows 7 Ultimate, no 3D acceleration.
Virtual CPUs: was 8 when it crashed. (OK, too many, but it didn't crash with 4.2.18!)
- some severe crashes with 4.3.0, about once or twice per day, after several hours of intensive CAD work in the VM
- no crashes with 4.2.18
- no crashes yet with 4.3.2 and 2 virtual CPUs, but testing time is < 1 day.
I cannot do stress tests as I cannot risk crashing the host + client systems.
Having an increased workload on the host - like VLC playing audio/video seems
to increase the risk of a crash of the VM. But that's close to guessing.
Regards,
Clemens
I haven't tested the system thoroughly yet but the first impression is that it seems
a bit slower than before.
Upd.
It's OK for me.
Hi!
I have the same problem.
But I cleaned my pacman cache before this problem appeared. Shame!
Where can I download the "4.2.18" version?
Thanks.
https://seblu.net/a/arm/2013/10/13/pool/community/virtualbox-4.2.18-2-i686.pkg.tar.xz
So I think it may be related to the Arch Linux package.
3.10.18-1-lts - works
3.11.7-1-ck - total freeze
3.12.0-1-ARCH - total freeze
Since it works with LTS I suppose something changed with later kernels.
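The three results above already narrow down where the behaviour changed. As a small sketch (the function names are hypothetical, and the dictionary simply restates the results reported in this comment), the newest working and oldest failing kernel can be picked out by comparing the numeric part of the release strings:

```python
import re

def kernel_version(release):
    """Turn '3.11.7-1-ck' into (3, 11, 7); the packaging suffix is ignored."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", release)
    if not match:
        raise ValueError(f"unrecognized kernel release: {release}")
    return tuple(int(part) for part in match.groups())

# Results exactly as reported in the comment above.
reports = {
    "3.10.18-1-lts": "works",
    "3.11.7-1-ck": "total freeze",
    "3.12.0-1-ARCH": "total freeze",
}

def regression_window(results):
    """Return (newest working release, oldest failing release).

    Raises ValueError if there is no working or no failing entry.
    """
    good = [rel for rel, status in results.items() if status == "works"]
    bad = [rel for rel, status in results.items() if status != "works"]
    return max(good, key=kernel_version), min(bad, key=kernel_version)

print(regression_window(reports))
```

The window is only approximate: the -lts, -ck and -ARCH packages are different builds, so a difference in patches or configuration could matter as much as the version number.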
linux-3.11.6-1-ARCH
virtualbox-4.3.2-1
There were three Windows 7 VMs running at the same time for about
an hour with lots of file system access. The host was otherwise idle.
Hardware: i7 860 CPU, Intel Chipset, Graphics: nVidia NVS290
All logs were empty. I have now enabled SysRq and hope to gather more
information next time it happens.
Is there a way to trigger the freeze?
1. Win* guest crash with vbox 4.3.x and kernel 3.12.x
2. I'm not affected by this on my two workstations. I'm unable to reproduce.
3. Still no bug opened upstream by people having this issue.
4. Downgrading vbox to 4.2.18 fixes the issue.
5. Downgrading the kernel to 3.10.x LTS fixes the issue.
6. Switching to virtualbox-bin 4.3.0 seems to avoid the issue.
7. vbox 4.3.2 fixes the BSOD with WinXP guests.
8. Not related to the extension pack.
- 4 and 5 suggest to me that this issue is upstream related.
- 6 suggests to me that this issue may be related to our way of building vbox (though it has been confirmed by only one user).
- 4, 5, 6 and 7 together suggest to me that there may be multiple issues in the same bug report.
Now, please, someone with this issue, open an upstream bug report and supply the information needed to fix this. We are going nowhere here.
https://forums.virtualbox.org/viewtopic.php?f=7&t=58022 (arch only)
https://forums.virtualbox.org/viewtopic.php?f=7&t=57842 (also gentoo)
The bug is very rare here (less than once a day), and I risk a lot of data each time it happens.
Is anybody able to trigger the bug directly?