FS#52764 - [gnome-session] Frequent gnome session crashes with segfault in libgtk-3.so

Attached to Project: Community Packages
Opened by Lucas (tr4ce) - Sunday, 29 January 2017, 16:41 GMT
Last edited by Doug Newgard (Scimmia) - Saturday, 18 March 2017, 12:59 GMT
Task Type Bug Report
Category Packages
Status Closed
Assigned To Jan de Groot (JGC)
Jan Alexander Steffens (heftig)
Architecture All
Severity Low
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 0
Private No

Details

Description: For the past one or two weeks I have been encountering frequent crashes of my GNOME session due to a segfault in libgtk-3.so. After a crash the login screen (GDM) is shown again, and I can log back in.

Entry in journalctl:

Jan 29 17:24:54 lucas-desktop kernel: gnome-session-f[5001]: segfault at 0 ip 00007f8b05e12c79 sp 00007fffcede60b0 error 4 in libgtk-3.so.0.2200.7[7f8b05b30000+6fa000]
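
A minimal sketch of how a kernel segfault line like the one above can be decoded. The function name `decode_segfault` and the regular expression are illustrative assumptions, not part of any tool. It computes the offset of the faulting instruction inside the library (ip minus the mapping base) and splits the x86 page-fault error code into its low three bits. The offset only corresponds to a file address if the reported mapping starts at file offset 0, which is assumed here.

```python
import re

# Assumed layout of the x86 kernel segfault message, based on the journal entry above.
# The kernel prints the error code in hexadecimal.
SEGFAULT_RE = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) "
    r"error (?P<err>[0-9a-f]+) in (?P<lib>[^\[]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)

def decode_segfault(line):
    """Hypothetical helper: extract the useful fields from one segfault line."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        raise ValueError("not a segfault line")
    err = int(m["err"], 16)
    return {
        "library": m["lib"],
        "fault_address": int(m["addr"], 16),
        # Offset of the faulting instruction relative to the start of the mapping.
        "offset_in_library": int(m["ip"], 16) - int(m["base"], 16),
        # x86 page-fault error code bits 0..2.
        "page_present": bool(err & 1),   # 0 means the page was not mapped
        "write_access": bool(err & 2),   # 0 means the access was a read
        "user_mode": bool(err & 4),      # 1 means the fault happened in user space
    }

line = ("Jan 29 17:24:54 lucas-desktop kernel: gnome-session-f[5001]: segfault at 0 "
        "ip 00007f8b05e12c79 sp 00007fffcede60b0 error 4 in "
        "libgtk-3.so.0.2200.7[7f8b05b30000+6fa000]")
info = decode_segfault(line)
print(info["library"], hex(info["offset_in_library"]))  # libgtk-3.so.0.2200.7 0x2e2c79
```

For this entry the result is a user-mode read of address 0 (a NULL pointer dereference) at offset 0x2e2c79 in libgtk-3.so. With a build that has debug symbols, an offset like this can typically be looked up with a tool such as addr2line.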

I've compiled gtk3 with debug symbols and with coredumpctl I obtain this stack trace:

PID: 5001 (gnome-session-f)
UID: 1000 (lucas)
GID: 1000 (lucas)
Signal: 11 (SEGV)
Timestamp: Sun 2017-01-29 17:24:54 CET (3min 10s ago)
Command Line: /usr/lib/gnome-session/gnome-session-failed --allow-logout
Executable: /usr/lib/gnome-session/gnome-session-failed
Control Group: /user.slice/user-1000.slice/session-c3.scope
Unit: session-c3.scope
Slice: user-1000.slice
Session: c3
Owner UID: 1000 (lucas)
Boot ID: 3e6a0cb47ca14c2b8c1df47bb4a25a2f
Machine ID: b93ccbbc89d84727afc3a8bf67be54da
Hostname: lucas-desktop
Storage: /var/lib/systemd/coredump/core.gnome-session-f.1000.3e6a0cb47ca14c2b8c1df47bb4a25a2f.5001.1485707094000000000000.lz4
Message: Process 5001 (gnome-session-f) of user 1000 dumped core.

Stack trace of thread 5001:
#0 0x00007f8b05e12c79 _gtk_style_provider_private_get_settings (libgtk-3.so.0)
#1 0x00007f8b05cae9b8 gtk_css_value_initial_compute (libgtk-3.so.0)
#2 0x00007f8b05cc3aa4 gtk_css_static_style_compute_value (libgtk-3.so.0)
#3 0x00007f8b05cafcec _gtk_css_lookup_resolve (libgtk-3.so.0)
#4 0x00007f8b05cc39cc gtk_css_static_style_new_compute (libgtk-3.so.0)
#5 0x00007f8b05cc3a25 gtk_css_static_style_get_default (libgtk-3.so.0)
#6 0x00007f8b05cb0652 gtk_css_node_init (libgtk-3.so.0)
#7 0x00007f8b0561b30f g_type_create_instance (libgobject-2.0.so.0)
#8 0x00007f8b055fd1fb n/a (libgobject-2.0.so.0)
#9 0x00007f8b055fec0d g_object_newv (libgobject-2.0.so.0)
#10 0x00007f8b055ff3c4 g_object_new (libgobject-2.0.so.0)
#11 0x00007f8b05ccbf0a gtk_css_widget_node_new (libgtk-3.so.0)
#12 0x00007f8b05ea9c67 gtk_widget_init (libgtk-3.so.0)
#13 0x00007f8b0561b30f g_type_create_instance (libgobject-2.0.so.0)
#14 0x00007f8b055fd1fb n/a (libgobject-2.0.so.0)
#15 0x00007f8b055fec0d g_object_newv (libgobject-2.0.so.0)
#16 0x00007f8b055ff3c4 g_object_new (libgobject-2.0.so.0)
#17 0x0000000000401db3 n/a (gnome-session-failed)
#18 0x00007f8b04f57291 __libc_start_main (libc.so.6)
#19 0x00000000004021ba n/a (gnome-session-failed)

GTK3 version: 3.22.7
I use Arc Dark as my GTK3 and shell theme.


Steps to reproduce: It's hard to pinpoint exactly when a crash occurs, but in (almost?) all cases I was trying to start an app or open a new window. Most of the time, starting nautilus, opening Evince, clicking a URL that opens a new tab in Firefox, or any other action that opens a window works fine, but sometimes my session crashes.

Closed by Doug Newgard (Scimmia)
Saturday, 18 March 2017, 12:59 GMT
Reason for closing: Fixed
Comment by Lucas (tr4ce) - Sunday, 29 January 2017, 16:43 GMT
Oh yeah, this may be useful: I also use the proprietary nVidia drivers.
Comment by Doug Newgard (Scimmia) - Sunday, 29 January 2017, 16:48 GMT
What theme?
Comment by Lucas (tr4ce) - Sunday, 29 January 2017, 16:50 GMT
Arc Dark
Comment by Doug Newgard (Scimmia) - Sunday, 29 January 2017, 17:01 GMT
Do you get the same issues with Adwaita?
Comment by Lucas (tr4ce) - Monday, 30 January 2017, 12:48 GMT
Yes, just had a crash while using the Adwaita theme.
Comment by Freddy (ekryyn) - Tuesday, 31 January 2017, 10:48 GMT
I experience the same problem as described. It first happened one or two weeks ago.
Comment by Wojciech Kwolek (irth) - Tuesday, 31 January 2017, 15:47 GMT
I experience the same issue without gnome, while using Darktable. Proprietary nVidia drivers.
I've attached the backtrace.
Comment by Jan Alexander Steffens (heftig) - Wednesday, 01 February 2017, 09:35 GMT
That darktable crash seems unrelated.

The backtrace from gnome-session-f(ailed) in the description of the bug is probably a red herring — that's the "fail whale" screen crashing, which is only started once gnome-session has already given up restarting critical session components (gnome-shell and gnome-settings-daemon).
Comment by Lucas (tr4ce) - Wednesday, 01 February 2017, 12:04 GMT
I've attached some coredumps from processes that often seem to crash together, namely ibus-x11 and gnome-settings-daemon.
Comment by Lucas (tr4ce) - Wednesday, 01 February 2017, 12:05 GMT
And gnome-shell. They all seem to crash with the same error.
Comment by Jan Alexander Steffens (heftig) - Wednesday, 01 February 2017, 13:31 GMT
Are there no other error or CRITICAL messages preceding the segfaults?
Comment by Lucas (tr4ce) - Thursday, 02 February 2017, 13:51 GMT
Nope, I don't see any other notable messages in my journal (checked with sudo journalctl -xke). There isn't anything notably strange in my Xorg.0.log either.
Comment by Lucas (tr4ce) - Sunday, 05 February 2017, 10:21 GMT
Here are some more coredumps from apps (firefox, Gephi) which seem to trigger an X server crash, and the output of journalctl -xke.

I did find some notable things:

Feb 05 10:08:31 lucas-desktop kernel: Calgary: detecting Calgary via BIOS EBDA area
Feb 05 10:08:31 lucas-desktop kernel: Calgary: Unable to locate Rio Grande table in EBDA - bailing!
Feb 05 10:08:31 lucas-desktop kernel: ------------[ cut here ]------------
Feb 05 10:08:31 lucas-desktop kernel: WARNING: CPU: 0 PID: 0 at drivers/iommu/dmar.c:844 warn_invalid_dmar.part.2+0x76/0x90
Feb 05 10:08:31 lucas-desktop kernel: Your BIOS is broken; DMAR reported at address 0!
BIOS vendor: American Megatrends Inc.; Ver: F7; Product Version: To be filled by O.E.M.
Feb 05 10:08:31 lucas-desktop kernel: Modules linked in:
Feb 05 10:08:31 lucas-desktop kernel: CPU: 0 PID: 0 Comm: swapper Not tainted 4.9.6-1-ARCH #1
Feb 05 10:08:31 lucas-desktop kernel: Hardware name: Gigabyte Technology Co., Ltd. Z87X-D3H/Z87X-D3H-CF, BIOS F7 08/02/2013
Feb 05 10:08:31 lucas-desktop kernel: ffffffff81a03d38 ffffffff81305440 ffffffff81a03d88 0000000000000000
Feb 05 10:08:31 lucas-desktop kernel: ffffffff81a03d78 ffffffff8107eb0b 0000034c00000001 0000000000000000
Feb 05 10:08:31 lucas-desktop kernel: ffffffff819386e3 ffffffff81d5601c ffffffff81d56058 ffffffff81a03ea0
Feb 05 10:08:31 lucas-desktop kernel: Call Trace:
Feb 05 10:08:31 lucas-desktop kernel: [<ffffffff81305440>] dump_stack+0x63/0x83
Feb 05 10:08:31 lucas-desktop kernel: [<ffffffff8107eb0b>] __warn+0xcb/0xf0
Feb 05 10:08:31 lucas-desktop kernel: [<ffffffff8107ec07>] warn_slowpath_fmt_taint+0x57/0x70
Feb 05 10:08:31 lucas-desktop kernel: [<ffffffff81b5e143>] ? early_ioremap+0x9/0xb
Feb 05 10:08:31 lucas-desktop kernel: [<ffffffff81b3cdc2>] ? __acpi_map_table+0x13/0x18
Feb 05 10:08:31 lucas-desktop kernel: [<ffffffff8143b7e6>] warn_invalid_dmar.part.2+0x76/0x90
Feb 05 10:08:31 lucas-desktop kernel: [<ffffffff816012b0>] dmar_validate_one_drhd+0xa0/0xe0
Feb 05 10:08:31 lucas-desktop kernel: [<ffffffff8143b3ba>] dmar_walk_remapping_entries+0x7a/0x190
Feb 05 10:08:31 lucas-desktop kernel: [<ffffffff81b739a6>] detect_intel_iommu+0x5f/0xf4
Feb 05 10:08:31 lucas-desktop kernel: [<ffffffff81601210>] ? xen_swiotlb_init+0x500/0x500
Feb 05 10:08:31 lucas-desktop kernel: [<ffffffff81b376ca>] pci_iommu_alloc+0x50/0x6c
Feb 05 10:08:31 lucas-desktop kernel: [<ffffffff81b469af>] mem_init+0x17/0x9d
Feb 05 10:08:31 lucas-desktop kernel: [<ffffffff81b27dc6>] start_kernel+0x22c/0x464
Feb 05 10:08:31 lucas-desktop kernel: [<ffffffff81b27120>] ? early_idt_handler_array+0x120/0x120
Feb 05 10:08:31 lucas-desktop kernel: [<ffffffff81b272d6>] x86_64_start_reservations+0x2a/0x2c
Feb 05 10:08:31 lucas-desktop kernel: [<ffffffff81b27424>] x86_64_start_kernel+0x14c/0x16f
Feb 05 10:08:31 lucas-desktop kernel: ---[ end trace 08604333801810c7 ]---

My BIOS is broken?! That doesn't sound good.

Another thing:
Feb 05 10:08:33 lucas-desktop kernel: NVRM: Your system is not currently configured to drive a VGA console
Feb 05 10:08:33 lucas-desktop kernel: NVRM: on the primary VGA device. The NVIDIA Linux graphics driver
Feb 05 10:08:33 lucas-desktop kernel: NVRM: requires the use of a text-mode VGA console. Use of other console
Feb 05 10:08:33 lucas-desktop kernel: NVRM: drivers including, but not limited to, vesafb, may result in
Feb 05 10:08:33 lucas-desktop kernel: NVRM: corruption and stability problems, and is not supported.
Feb 05 10:08:33 lucas-desktop kernel: nvidia-modeset: Allocated GPU:0 (GPU-528b2b2e-ff28-1271-5b19-1d7b7ab2c5e9) @ PCI:0000:01:00.0

I googled this error a bit, and it has something to do with the nVidia driver and UEFI, but I've been running in UEFI mode for years with no problems, so I don't think this is the cause.


Comment by Lucas (tr4ce) - Tuesday, 28 February 2017, 20:22 GMT
Ok, so I still have this problem. I recently managed to sort of capture another error (double free). It's a photo of my screen because I cannot find this error anywhere else (not in journalctl, not with coredumpctl). I also had to be quick because X restarts itself after a few moments (the GDM login screen appears).

I suspect this is something weird in the nVidia driver, but can anyone think of some pointers on where to start looking for fixes?
Comment by Lucas (tr4ce) - Saturday, 18 March 2017, 11:02 GMT
This seems to be fixed with the new Linux kernel and/or updated nvidia driver.
