FS#53582 - Segmentation fault in libmozjs-38.so (extra/js38) in gnome-shell

Attached to Project: Arch Linux
Opened by Daniel Playfair Cal (hedgepigdaniel) - Wednesday, 05 April 2017, 23:38 GMT
Last edited by Jan de Groot (JGC) - Tuesday, 25 April 2017, 12:16 GMT
Task Type Bug Report
Category Packages: Extra
Status Closed
Assigned To No-one
Architecture x86_64
Severity High
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 3
Private No

Details

Description:

Segmentation fault in libmozjs-38.so, provided by extra/js38. It is triggered by some JavaScript in the taskbar shell extension. I reported a bug against the shell extension, but it seems to be a separate bug that this causes a segfault instead of a JS error.

Additional info:
* gnome-shell-extension-taskbar 55 (Built from AUR package after changing the version from 53 to 55)
* Kernel 4.9.18
* js38 38.0.0-1
* gnome-shell, gnome-shell-extensions 3.24 (Built from source along with other gnome packages, no changes other than the release SHA/tag and skipping cherry-picks that have been merged)
* gjs 1.48.0+5+ga6f0735-1


Steps to reproduce:
* enable taskbar gnome extension
* start a gnome session, lock the screen, log back in, lock the screen again (with taskbar 54 it happens on the first lock)
* notice that the session has crashed

Log:
```
Apr 06 08:54:46 danielpc-arch pkexec[14412]: daniel: Executing command [USER=root] [TTY=unknown] [CWD=/home/daniel] [COMMAND=/usr/lib/gnome-settings-daemon/gsd-backlight-helper --set-brightness 56]
Apr 06 08:55:08 danielpc-arch kernel: traps: gnome-shell[13359] general protection ip:7fa9894adc8d sp:7ffd64e24730 error:0
Apr 06 08:55:08 danielpc-arch kernel: in libmozjs-38.so[7fa98909c000+725000]
Apr 06 08:55:08 danielpc-arch systemd[1]: Started Process Core Dump (PID 14422/UID 0).
Apr 06 08:55:09 danielpc-arch c[13648]: Error reading events from display: Broken pipe
Apr 06 08:55:09 danielpc-arch unknown[13639]: Error reading events from display: Broken pipe
Apr 06 08:55:09 danielpc-arch org.gnome.Shell.desktop[13359]: (EE)
Apr 06 08:55:09 danielpc-arch org.gnome.Shell.desktop[13359]: Fatal server error:
Apr 06 08:55:09 danielpc-arch org.gnome.Shell.desktop[13359]: (EE) failed to read Wayland events: Broken pipe
Apr 06 08:55:09 danielpc-arch org.gnome.Shell.desktop[13359]: (EE)
Apr 06 08:55:09 danielpc-arch polkitd[532]: Unregistered Authentication Agent for unix-session:c7 (system bus name :1.1374, object path /org/freedesktop/PolicyKit1/AuthenticationAgent, locale en_AU.utf8) (discon
Apr 06 08:55:09 danielpc-arch gnome-session[13328]: gnome-session-binary[13328]: WARNING: Application 'org.gnome.Shell.desktop' killed by signal 11
Apr 06 08:55:09 danielpc-arch terminator[13721]: Error reading events from display: Broken pipe
Apr 06 08:55:09 danielpc-arch gnome-session-binary[13328]: WARNING: Application 'org.gnome.Shell.desktop' killed by signal 11
Apr 06 08:55:09 danielpc-arch gnome-session-binary[13328]: Unrecoverable failure in required component org.gnome.Shell.desktop
Apr 06 08:55:09 danielpc-arch gsd-color[13507]: gsd-color: Fatal IO error 11 (Resource temporarily unavailable) on X server :0.
Apr 06 08:55:09 danielpc-arch org.a11y.atspi.Registry[13392]: XIO: fatal IO error 11 (Resource temporarily unavailable) on X server ":0"
Apr 06 08:55:09 danielpc-arch org.a11y.atspi.Registry[13392]: after 145 requests (145 known processed) with 0 events remaining.
Apr 06 08:55:09 danielpc-arch chromium.desktop[13816]: [13901:13901:0406/085509.433278:ERROR:x11_util.cc(88)] X IO error received (X server probably went away)
Apr 06 08:55:09 danielpc-arch gsd-media-keys[13515]: gsd-media-keys: Fatal IO error 11 (Resource temporarily unavailable) on X server :0.
Apr 06 08:55:09 danielpc-arch chromium.desktop[13816]: [13816:13816:0406/085509.433482:ERROR:chrome_browser_main_extra_parts_x11.cc(62)] X IO error received (X server probably went away)
Apr 06 08:55:09 danielpc-arch gsd-print-notif[13477]: gsd-print-notifications: Fatal IO error 11 (Resource temporarily unavailable) on X server :0.
Apr 06 08:55:09 danielpc-arch gsd-wacom[13487]: gsd-wacom: Fatal IO error 11 (Resource temporarily unavailable) on X server :0.
Apr 06 08:55:09 danielpc-arch gsd-a11y-keyboa[13504]: gsd-a11y-keyboard: Fatal IO error 11 (Resource temporarily unavailable) on X server :0.
Apr 06 08:55:09 danielpc-arch gsd-xsettings[13501]: gsd-xsettings: Fatal IO error 104 (Connection reset by peer) on X server :0.
Apr 06 08:55:09 danielpc-arch gsd-power[13474]: gsd-power: Fatal IO error 11 (Resource temporarily unavailable) on X server :0.
Apr 06 08:55:09 danielpc-arch gsd-clipboard[13506]: gsd-clipboard: Fatal IO error 11 (Resource temporarily unavailable) on X server :0.
Apr 06 08:55:09 danielpc-arch gsd-keyboard[13514]: gsd-keyboard: Fatal IO error 11 (Resource temporarily unavailable) on X server :0.
Apr 06 08:55:09 danielpc-arch gsd-xrandr[13498]: gsd-xrandr: Fatal IO error 11 (Resource temporarily unavailable) on X server :0.
Apr 06 08:55:09 danielpc-arch gsd-housekeepin[13511]: gsd-housekeeping: Fatal IO error 11 (Resource temporarily unavailable) on X server :0.
Apr 06 08:55:09 danielpc-arch pulseaudio[13398]: ICE default IO error handler doing an exit(), pid = 13398, errno = 11
Apr 06 08:55:09 danielpc-arch systemd[13311]: pulseaudio.service: Main process exited, code=exited, status=1/FAILURE
Apr 06 08:55:09 danielpc-arch systemd[13311]: pulseaudio.service: Unit entered failed state.
Apr 06 08:55:09 danielpc-arch systemd[13311]: pulseaudio.service: Failed with result 'exit-code'.
Apr 06 08:55:09 danielpc-arch gdm-password][13308]: pam_unix(gdm-password:session): session closed for user daniel
...
Apr 06 08:55:09 danielpc-arch systemd[13311]: pulseaudio.service: Service hold-off time over, scheduling restart.
Apr 06 08:55:09 danielpc-arch systemd[13311]: Stopped Sound Service.
Apr 06 08:55:09 danielpc-arch systemd[13311]: Starting Sound Service...
Apr 06 08:55:09 danielpc-arch rtkit-daemon[627]: Successfully made thread 14431 of process 14431 (/usr/bin/pulseaudio) owned by '1000' high priority at nice level -11.
Apr 06 08:55:09 danielpc-arch rtkit-daemon[627]: Supervising 1 threads of 1 processes of 1 users.
Apr 06 08:55:09 danielpc-arch pulseaudio[14431]: W: [pulseaudio] pid.c: Stale PID file, overwriting.
Apr 06 08:55:09 danielpc-arch systemd-coredump[14423]: Process 13359 (gnome-shell) of user 1000 dumped core.

Stack trace of thread 13359:
#0 0x00007fa9894adc8d n/a (libmozjs-38.so)
#1 0x00007fa990e12162 n/a (n/a)
Apr 06 08:55:11 danielpc-arch pulseaudio[14431]: E: [pulseaudio] module-alsa-card.c: Failed to open mixer for jack detection
Apr 06 08:55:11 danielpc-arch systemd[13311]: Started Sound Service.
Apr 06 08:55:11 danielpc-arch systemd-logind[477]: Removed session c7.
Apr 06 08:55:11 danielpc-arch systemd[1]: Stopping User Manager for UID 1000...
Apr 06 08:55:11 danielpc-arch tracker-store[13634]: Received signal:15->'Terminated'
Apr 06 08:55:11 danielpc-arch tracker-store[13634]: OK
```

Between version 54 and 55 the problems stopped happening on the first lock: https://github.com/zpydr/gnome-shell-extension-taskbar
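
For triage, the kernel lines in the log can be turned into an offset inside libmozjs-38.so by subtracting the mapping base from the faulting instruction pointer. The following is only a minimal sketch: the helper name `fault_offset` and both regular expressions are illustrative, not part of any tool mentioned in this report, and it assumes the reported mapping starts at the library's load base.

```python
import re

# Matches the "in <library>[<base>+<size>]" part of a kernel fault message.
MAPPING_RE = re.compile(
    r"in (?P<lib>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)
# Matches both "ip:7fa9..." (traps: lines) and "ip 00007fdb..." (segfault lines).
IP_RE = re.compile(r"\bip[: ](?P<ip>[0-9a-f]+)")


def fault_offset(log_text):
    """Return (library, offset) for the faulting instruction, or None."""
    ip_match = IP_RE.search(log_text)
    map_match = MAPPING_RE.search(log_text)
    if not ip_match or not map_match:
        return None
    ip = int(ip_match.group("ip"), 16)
    base = int(map_match.group("base"), 16)
    size = int(map_match.group("size"), 16)
    if not base <= ip < base + size:
        return None  # instruction pointer lies outside the reported mapping
    return map_match.group("lib"), ip - base


# The two kernel lines from the log above, joined into one string.
first_crash = (
    "kernel: traps: gnome-shell[13359] general protection "
    "ip:7fa9894adc8d sp:7ffd64e24730 error:0\n"
    "kernel: in libmozjs-38.so[7fa98909c000+725000]"
)
lib, offset = fault_offset(first_crash)
print(lib, hex(offset))
```

With a libmozjs-38.so build that has debug symbols, an offset like this could be handed to a tool such as addr2line to get a function name, which the `n/a` frames in the coredump stack trace do not provide.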

Closed by Jan de Groot (JGC)
Tuesday, 25 April 2017, 12:16 GMT
Reason for closing: Fixed
Comment by Jan de Groot (JGC) - Thursday, 06 April 2017, 08:50 GMT
Please test js38-38.8.0-1 from testing. You might need to recompile gjs though. If that's the case, please mention it, as I would then have to do the same in gnome-unstable.
Comment by Daniel Playfair Cal (hedgepigdaniel) - Thursday, 06 April 2017, 12:20 GMT
Same behaviour with js38 38.8.0-1.

There was no need, but I also tried compiling the latest gjs from master (git 50651d0). The effect was that the segfault now happens on the first lock rather than the second (which was also the behaviour with taskbar 54 rather than 55).
Comment by Jan de Groot (JGC) - Thursday, 06 April 2017, 14:15 GMT
Can you also try building gjs 1.48.0 release instead of the git snapshot that is in gnome-unstable?
Comment by Daniel Playfair Cal (hedgepigdaniel) - Thursday, 06 April 2017, 22:54 GMT
OK, I tried the 1.48 release tag of gjs, still the same thing (on the second lock). Here's another log (still with js38 38.8.0-1):
[code]
Apr 07 08:49:02 danielpc-arch gnome-shell[967]: JS WARNING: [resource:///org/gnome/shell/gdm/util.js 330]: reference to undefined property this._preemptingService
Apr 07 08:49:05 danielpc-arch gnome-shell[967]: WARNING: addSignalMethods is replacing existing [0x46e9340 Gjs_ShowAppsIcon.dash-item-container] connect method
Apr 07 08:49:05 danielpc-arch gnome-shell[967]: WARNING: addSignalMethods is replacing existing [0x46e9340 Gjs_ShowAppsIcon.dash-item-container] disconnect method
Apr 07 08:49:05 danielpc-arch gnome-shell[967]: WARNING: addSignalMethods is replacing existing [0x46e9340 Gjs_ShowAppsIcon.dash-item-container] emit method
Apr 07 08:49:07 danielpc-arch kernel: gnome-shell[967]: segfault at fffffffffffffffe ip 00007fdb7ca1cdd2 sp 00007ffc1c3331c0 error 5 in libmozjs-38.so[7fdb7c63f000+6cb000]
Apr 07 08:49:07 danielpc-arch systemd[1]: Created slice system-systemd\x2dcoredump.slice.
Apr 07 08:49:07 danielpc-arch systemd[1]: Started Process Core Dump (PID 1876/UID 0).
Apr 07 08:49:07 danielpc-arch redshift-gtk[1265]: Error reading events from display: Broken pipe
Apr 07 08:49:07 danielpc-arch c[1293]: Error reading events from display: Broken pipe
Apr 07 08:49:07 danielpc-arch org.gnome.Shell.desktop[967]: (EE)
Apr 07 08:49:07 danielpc-arch org.gnome.Shell.desktop[967]: Fatal server error:
Apr 07 08:49:07 danielpc-arch org.gnome.Shell.desktop[967]: (EE) failed to read Wayland events: Broken pipe
Apr 07 08:49:07 danielpc-arch org.gnome.Shell.desktop[967]: (EE)
Apr 07 08:49:07 danielpc-arch gnome-session[936]: gnome-session-binary[936]: WARNING: Application 'org.gnome.Shell.desktop' killed by signal 11
Apr 07 08:49:07 danielpc-arch gnome-session-binary[936]: Unrecoverable failure in required component org.gnome.Shell.desktop
Apr 07 08:49:07 danielpc-arch gnome-session-binary[936]: WARNING: Application 'org.gnome.Shell.desktop' killed by signal 11
Apr 07 08:49:07 danielpc-arch polkitd[403]: Unregistered Authentication Agent for unix-session:c3 (system bus name :1.76, object path /org/freedesktop/PolicyKit1/AuthenticationAgent, locale en_AU.utf8) (disconne
Apr 07 08:49:07 danielpc-arch gdm-password][916]: pam_unix(gdm-password:session): session closed for user daniel
Apr 07 08:49:07 danielpc-arch pulseaudio[1013]: ICE default IO error handler doing an exit(), pid = 1013, errno = 11
Apr 07 08:49:07 danielpc-arch chromium.desktop[1339]: [1339:1339:0407/084907.952882:ERROR:zygote_communication_linux.cc(296)] Failed to send GetTerminationStatus message to zygote
Apr 07 08:49:07 danielpc-arch chromium.desktop[1339]: [1339:1339:0407/084907.960035:ERROR:zygote_communication_linux.cc(296)] Failed to send GetTerminationStatus message to zygote
Apr 07 08:49:07 danielpc-arch systemd[919]: pulseaudio.service: Main process exited, code=exited, status=1/FAILURE
Apr 07 08:49:07 danielpc-arch chromium.desktop[1339]: [1339:1339:0407/084907.960506:ERROR:zygote_communication_linux.cc(296)] Failed to send GetTerminationStatus message to zygote
Apr 07 08:49:07 danielpc-arch systemd[919]: pulseaudio.service: Unit entered failed state.
Apr 07 08:49:07 danielpc-arch systemd[919]: pulseaudio.service: Failed with result 'exit-code'.
Apr 07 08:49:07 danielpc-arch chromium.desktop[1339]: [1339:1339:0407/084907.963401:ERROR:chrome_browser_main_extra_parts_x11.cc(62)] X IO error received (X server probably went away)
Apr 07 08:49:07 danielpc-arch org.a11y.atspi.Registry[1007]: XIO: fatal IO error 11 (Resource temporarily unavailable) on X server ":0"
Apr 07 08:49:07 danielpc-arch org.a11y.atspi.Registry[1007]: after 203 requests (203 known processed) with 0 events remaining.
Apr 07 08:49:07 danielpc-arch chromium.desktop[1339]: [1339:1339:0407/084907.964003:ERROR:zygote_communication_linux.cc(296)] Failed to send GetTerminationStatus message to zygote
Apr 07 08:49:07 danielpc-arch chromium.desktop[1339]: [1339:1339:0407/084907.964473:ERROR:zygote_communication_linux.cc(296)] Failed to send GetTerminationStatus message to zygote
Apr 07 08:49:07 danielpc-arch chromium.desktop[1339]: [1339:1339:0407/084907.964854:ERROR:zygote_communication_linux.cc(296)] Failed to send GetTerminationStatus message to zygote
Apr 07 08:49:07 danielpc-arch chromium.desktop[1339]: [1339:1339:0407/084907.965210:ERROR:zygote_communication_linux.cc(296)] Failed to send GetTerminationStatus message to zygote
Apr 07 08:49:07 danielpc-arch chromium.desktop[1339]: [1339:1339:0407/084907.965483:ERROR:zygote_communication_linux.cc(296)] Failed to send GetTerminationStatus message to zygote
Apr 07 08:49:08 danielpc-arch systemd-coredump[1877]: Process 967 (gnome-shell) of user 1000 dumped core.

Stack trace of thread 967:
#0 0x00007fdb7ca1cdd2 n/a (libmozjs-38.so)
#1 0x00007fdb842dd162 n/a (n/a)
[/code]
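
The `error 5` field in the `segfault at` line is the x86 page-fault error code, a small bit field. The sketch below decodes it; the table and the function name `decode_page_fault` are illustrative, only the five architecturally defined low bits are covered, and it does not apply to the `general protection ... error:0` line from the first log, which reports a different kind of fault.

```python
# Low bits of the x86 page-fault error code, as printed by the kernel in
# "segfault at <addr> ... error <N>" lines.
# Each entry: (mask, meaning if set, meaning if clear or None).
PF_BITS = [
    (0x01, "protection violation", "page not present"),
    (0x02, "write access", "read access"),
    (0x04, "user mode", "kernel mode"),
    (0x08, "reserved bit set in page table", None),
    (0x10, "instruction fetch", None),
]


def decode_page_fault(code):
    """Return a list of human-readable descriptions for an error code."""
    parts = []
    for mask, if_set, if_clear in PF_BITS:
        if code & mask:
            parts.append(if_set)
        elif if_clear is not None:
            parts.append(if_clear)
    return parts


print(decode_page_fault(5))
```

For the log above this describes a user-mode read that failed a protection check. The faulting address fffffffffffffffe is -2 when read as a signed 64-bit integer, which is not a valid user-space pointer.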
Comment by Cedric Bellegarde (gnumdk) - Tuesday, 11 April 2017, 05:41 GMT
Happens without extensions too...
(gdb) backtrace
#0 0x00007f49bad29a10 in raise () at /usr/lib/libc.so.6
#1 0x00007f49bad2b13a in abort () at /usr/lib/libc.so.6
#2 0x00007f49bad682b0 in __libc_message () at /usr/lib/libc.so.6
#3 0x00007f49bad6e90e in malloc_printerr () at /usr/lib/libc.so.6
#4 0x00007f49bad6f11e in _int_free () at /usr/lib/libc.so.6
#5 0x00007f49bd1f3933 in () at /usr/lib/libgjs.so.0
#6 0x00007f49b642fa83 in () at /usr/lib/libmozjs-38.so
#7 0x00007f49b648ad7c in () at /usr/lib/libmozjs-38.so
#8 0x00007f49b6430d51 in () at /usr/lib/libmozjs-38.so
#9 0x00007f49b644679a in () at /usr/lib/libmozjs-38.so
#10 0x00007f49b64471dc in () at /usr/lib/libmozjs-38.so
#11 0x00007f49b6448fb5 in () at /usr/lib/libmozjs-38.so
#12 0x00007f49b64499b6 in () at /usr/lib/libmozjs-38.so
#13 0x00007f49b6449bd7 in () at /usr/lib/libmozjs-38.so
#14 0x00007f49b6449fd4 in () at /usr/lib/libmozjs-38.so
#15 0x00007f49bd207cd9 in gjs_schedule_gc_if_needed () at /usr/lib/libgjs.so.0
#16 0x00007f49bd207d44 in gjs_call_function_value () at /usr/lib/libgjs.so.0
#17 0x00007f49bd1e3015 in gjs_closure_invoke () at /usr/lib/libgjs.so.0
#18 0x00007f49bd1fabbc in () at /usr/lib/libgjs.so.0
#19 0x00007f49bb5dbf75 in g_closure_invoke () at /usr/lib/libgobject-2.0.so.0
#20 0x00007f49bb5edf82 in () at /usr/lib/libgobject-2.0.so.0
#21 0x00007f49bb5f6bdc in g_signal_emit_valist () at /usr/lib/libgobject-2.0.so.0
#22 0x00007f49bb5f6fbf in g_signal_emit () at /usr/lib/libgobject-2.0.so.0
#23 0x00007f49bcacc69a in meta_stack_tracker_sync_stack () at /usr/lib/libmutter-0.so.0
#24 0x00007f49bcacc6e9 in () at /usr/lib/libmutter-0.so.0
#25 0x00007f49bcacd783 in () at /usr/lib/libmutter-0.so.0
#26 0x00007f49bbe56c04 in () at /usr/lib/mutter/libmutter-clutter-0.so
#27 0x00007f49bbe578b7 in () at /usr/lib/mutter/libmutter-clutter-0.so
#28 0x00007f49bb302797 in g_main_context_dispatch () at /usr/lib/libglib-2.0.so.0
#29 0x00007f49bb302a00 in () at /usr/lib/libglib-2.0.so.0
#30 0x00007f49bb302d22 in g_main_loop_run () at /usr/lib/libglib-2.0.so.0
#31 0x00007f49bcac1b3c in meta_run () at /usr/lib/libmutter-0.so.0
#32 0x0000000000401ff7 in main ()
Comment by Adam Kürthy (adee) - Tuesday, 11 April 2017, 20:46 GMT
I rebuilt libmozjs-38 with -Wno-stack-protector and so far no crashes.
Not sure if this is the cause or I'm simply lucky. It crashed rarely.

I got the idea from:
https://bugs.launchpad.net/ubuntu/+source/mozjs38/+bug/1668858
Comment by Daniel Playfair Cal (hedgepigdaniel) - Tuesday, 11 April 2017, 23:40 GMT
Isn't -Wxxx just a warning option, which can't have any effect other than failing the build if -Werror is enabled? I tried with -fno-stack-protector added to CXXFLAGS in the PKGBUILD (js38 38.0.0-1). It still crashes in the same way, although this time only on the third lock (record!!!!!)
Comment by Jan de Groot (JGC) - Wednesday, 12 April 2017, 08:55 GMT
Comment by Daniel Playfair Cal (hedgepigdaniel) - Wednesday, 12 April 2017, 10:51 GMT
About 10 locks in and it seems to have fixed it :)
Comment by Adam Kürthy (adee) - Wednesday, 19 April 2017, 21:21 GMT
After I updated to extra/js38 38.8.0-2 it started crashing again.
Before I used a package built by myself. My previous description was inaccurate, sorry.
What I did was replace the CFLAGS and CXXFLAGS in the PKGBUILD with:
CXXFLAGS='-march=x86-64 -mtune=generic -O2 -pipe -fno-delete-null-pointer-checks -Wno-stack-protector'
CFLAGS='-march=x86-64 -mtune=generic -O2 -pipe -fno-delete-null-pointer-checks -Wno-stack-protector'

(Removed the makepkg default -fstack-protector-strong and added -Wno-stack-protector)

With this it ran for almost a day. After installing extra/js38 38.8.0-2 and rebooting, it crashed within an hour.

I can upload a stack trace after re-opening the issue.
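
Several of the comments above turn on which compiler flags can change the generated library and which only affect diagnostics. As a rough aid, the flags quoted in this thread can be sorted by prefix. This is a sketch under stated assumptions: `classify_gcc_flag` and its category names are made up for illustration, and the prefix rules are a simplification of GCC's option families rather than a complete parser.

```python
def classify_gcc_flag(flag):
    """Roughly classify a GCC command-line flag by its prefix."""
    # -Wl, -Wa, and -Wp, hand the rest of the option to the linker,
    # assembler or preprocessor, so they are not warning options.
    if flag.startswith(("-Wl,", "-Wa,", "-Wp,")):
        return "pass-through"
    if flag.startswith("-W"):
        return "warning"       # diagnostics only
    if flag.startswith("-f"):
        return "feature"       # mostly code generation and optimization
    if flag.startswith("-m"):
        return "machine"       # target-specific options
    if flag.startswith("-O"):
        return "optimization"
    if flag.startswith(("-D", "-U")):
        return "macro"         # preprocessor definitions
    return "other"


# Flags taken from the comments in this thread.
flags = [
    "-Wno-stack-protector",
    "-fno-stack-protector",
    "-fstack-protector-strong",
    "-fno-delete-null-pointer-checks",
    "-D_FORTIFY_SOURCE=2",
    "-Wl,-z,now",
]
for flag in flags:
    print(f"{flag:35} {classify_gcc_flag(flag)}")
```

Read this way, `-Wno-stack-protector` only silences a warning, while dropping `-fstack-protector-strong` from the flags is the part of the change that can alter the compiled library.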
Comment by Jan de Groot (JGC) - Wednesday, 19 April 2017, 21:56 GMT
Please try https://pkgbuild.com/~jgc/js38-38.8.0-3-x86_64.pkg.tar.xz

This version removes mozjs38-1269317.patch and adds -D_FORTIFY_SOURCE=2 -O2 to CPPFLAGS and -Wl,-z,now to LDFLAGS. These hardening flags are also used by Firefox, so I see no reason why compiling with -fstack-protector-strong should kill the library.


Comment by Adam Kürthy (adee) - Thursday, 20 April 2017, 10:44 GMT
I'm running the test version now. So far it has been working for 3 hours. I need more time to test.

In the meantime I found the upstream bug report for my issue: https://bugzilla.gnome.org/show_bug.cgi?id=781194
This may be of interest to us.
Comment by Adam Kürthy (adee) - Thursday, 20 April 2017, 13:59 GMT
Now that I look at the situation, it was probably solved by the gjs update a few hours ago. That should include the fix from the bug report.
No crashes here so far.
Comment by Cedric Bellegarde (gnumdk) - Thursday, 20 April 2017, 17:46 GMT
For me the bug is fixed by the patch from https://bugzilla.gnome.org/show_bug.cgi?id=781194
Comment by Adam Kürthy (adee) - Saturday, 22 April 2017, 10:37 GMT
I reverted to extra/js38 38.8.0-2. As far as I'm concerned this problem is fixed, no further action required.
This bug can be closed.

Thanks.
