FS#52129 - [openssh] segfault
Attached to Project: Arch Linux
Opened by Felix Krohn (kro) - Monday, 12 December 2016, 14:19 GMT
Last edited by Gaetan Bisson (vesath) - Tuesday, 17 January 2017, 22:05 GMT
Details
Description: openssh dies on a fresh install of ArchLinux
# pacman -Ss openssh
core/openssh 7.3p1-2 [installed]
    Free version of the SSH connectivity tools
# /usr/sbin/sshd -D
Segmentation fault
# ldd /usr/bin/sshd
/usr/bin/bash: warning: setlocale: LC_ALL: cannot change locale (en_US.UTF-8)
    linux-vdso.so.1 (0x00007ffee76c9000)
    libpam.so.0 => /usr/lib/libpam.so.0 (0x00007f383a0e0000)
    libcrypto.so.1.0.0 => /usr/lib/libcrypto.so.1.0.0 (0x00007f3839c68000)
    libutil.so.1 => /usr/lib/libutil.so.1 (0x00007f3839a65000)
    libz.so.1 => /usr/lib/libz.so.1 (0x00007f383984f000)
    libcrypt.so.1 => /usr/lib/libcrypt.so.1 (0x00007f3839617000)
    libgssapi_krb5.so.2 => /usr/lib/libgssapi_krb5.so.2 (0x00007f38393c9000)
    libkrb5.so.3 => /usr/lib/libkrb5.so.3 (0x00007f38390e4000)
    libc.so.6 => /usr/lib/libc.so.6 (0x00007f3838d46000)
    libdl.so.2 => /usr/lib/libdl.so.2 (0x00007f3838b42000)
    libk5crypto.so.3 => /usr/lib/libk5crypto.so.3 (0x00007f3838911000)
    libcom_err.so.2 => /usr/lib/libcom_err.so.2 (0x00007f383870d000)
    libkrb5support.so.0 => /usr/lib/libkrb5support.so.0 (0x00007f3838500000)
    libkeyutils.so.1 => /usr/lib/libkeyutils.so.1 (0x00007f38382fc000)
    libresolv.so.2 => /usr/lib/libresolv.so.2 (0x00007f38380e5000)
    /lib64/ld-linux-x86-64.so.2 (0x00007f383a2ee000)
    libpthread.so.0 => /usr/lib/libpthread.so.0 (0x00007f3837ec8000)
Additional info:
* complete strace output ("strace /usr/sbin/sshd -D") in attachment
Steps to reproduce:
- install most recent Archlinux with openssh
Closed by Gaetan Bisson (vesath)
Tuesday, 17 January 2017, 22:05 GMT
Reason for closing: No response
(see attached screenshot)
I'm adding our glibc experts in case they can shed any light on this.
Related to FS#51709 maybe?
- I tried recompiling openssh and, for good measure, also glibc: no change
- I'm currently trying this on an "Intel(R) Xeon(R) CPU E5-1620 v2 @ 3.70GHz", but see the same issue across many different hardware configurations.
- Only a few processes showed the same issue: strace itself, sshd and a couple of 'ld' invocations while recompiling glibc
- systemd-coredump output:
# strace /usr/sbin/sshd -D 2>ssh.strace
Segmentation fault (core dumped)
Dec 14 18:54:51 ns229132 systemd[1]: Started Process Core Dump (PID 30677/UID 0).
Dec 14 18:54:51 ns229132 systemd[1]: Started Process Core Dump (PID 30681/UID 0).
Dec 14 18:54:51 ns229132 systemd-coredump[30682]: Resource limits disable core dumping for process 30674 (strace).
Dec 14 18:54:51 ns229132 systemd-coredump[30682]: Process 30674 (strace) of user 0 dumped core.
Dec 14 18:54:51 ns229132 systemd-coredump[30678]: Process 30676 (sshd) of user 0 dumped core.
Stack trace of thread 30676:
#0 0x00007f11064bf107 __memset_sse2_unaligned_erms (libc.so.6)
#1 0x0000556394151c98 n/a (sshd)
#2 0x00007f110645b291 __libc_start_main (libc.so.6)
#3 0x000055639415461a n/a (sshd)
- I'm attaching the dump file to this thread accordingly
- I'm now trying out some of the hints given in FS#51709 (nsswitch.conf, systemd-resolve), but no luck so far
- I'm using a dropbear sshd to access the server in question and can provide access if helpful.
Same for me, testing/openssh doesn't change anything.
explicit_bzero(privsep_pw->pw_passwd,
strlen(privsep_pw->pw_passwd));
in sshd.c, line 1643.
Valgrind result:
==18568== Process terminating with default action of signal 11 (SIGSEGV)
==18568== Bad permissions for mapped region at address 0x40B1E35
==18568== at 0x6B74107: __memset_sse2_unaligned_erms (in /usr/lib/libc-2.24.so)
==18568== by 0x1129A1: main (sshd.c:1643)
This behavior is easily reproduced with a simple application, which I attach to this comment along with a /proc/cpuinfo dump.
I hope this will help.
cpuinfo.txt (13.2 KiB)
extra/intel-ucode 20161104-1 [installed]
Microcode update files for Intel CPUs
Microcode update is being applied at boot:
[ 0.892448] microcode: CPU0 sig=0x206d7, pf=0x1, revision=0x710
[ 0.892495] microcode: CPU1 sig=0x206d7, pf=0x1, revision=0x710
[ 0.892544] microcode: CPU2 sig=0x206d7, pf=0x1, revision=0x710
[ 0.892555] microcode: CPU3 sig=0x206d7, pf=0x1, revision=0x710
[ 0.892604] microcode: CPU4 sig=0x206d7, pf=0x1, revision=0x710
[ 0.892651] microcode: CPU5 sig=0x206d7, pf=0x1, revision=0x710
[ 0.892697] microcode: CPU6 sig=0x206d7, pf=0x1, revision=0x710
[ 0.892745] microcode: CPU7 sig=0x206d7, pf=0x1, revision=0x710
[ 0.892795] microcode: CPU8 sig=0x206d7, pf=0x1, revision=0x710
[ 0.892812] microcode: CPU9 sig=0x206d7, pf=0x1, revision=0x710
[ 0.892860] microcode: CPU10 sig=0x206d7, pf=0x1, revision=0x710
[ 0.892908] microcode: CPU11 sig=0x206d7, pf=0x1, revision=0x710
[ 0.893022] microcode: Microcode Update Driver: v2.01 <tigran@aivazian.fsnet.co.uk>, Peter Oruba
same applies to crash.c attached above by Adam.
broken: fresh installation using arch-bootstrap.sh and the up-to-date package repository.
functional: fresh installation using arch-bootstrap.sh and a package repository snapshot from December 8th, then run "pacman -Sy; pacman -Su" which installs 24 package updates (bash-4.4.005-2 coreutils-8.26-1 filesystem-2016.12-2 geoip-database-20161206-1 gnupg-2.1.16-2 gnutls-3.4.17-1 icu-58.2-1 iproute2-4.9.0-1 libarchive-3.2.2-1 libgcrypt-1.7.5-1 libsystemd-232-6 libunistring-0.9.7-1 linux-lts-4.4.39-1 logrotate-3.11.0-1 man-db-2.7.6.1-2 man-pages-4.09-1 man-pages-de-1.18-1 nano-2.7.2-1 ncurses-6.0+20161203-1 openssh-7.4p1-1 pacman-mirrorlist-20161214-1 readline-7.0.001-1 systemd-232-6 systemd-sysvcompat-232-6)
- the package list (output of 'pacman -Qs|grep -v "^ "|cut -d/ -f2-') is 100% identical between both installations
- the installation script (arch-bootstrap.sh) is also exactly identical, only the given mirror repo differs (official mirror versus snapshot of official mirror)
My conclusion is that one of the updated packages behaves differently depending on whether it is installed in a chroot by arch-bootstrap.sh or on the booted system by pacman -Su. Probably some ldconfig hooks?
My intuition and prior experiences tell me I should automatically blame systemd :-), but I can't prove it (yet).
The above workaround is active at OVH and you can now relaunch your installation.
Otherwise, it looks like a post_install scriptlet has not run in the first install.
@Alan: yes, the snapshot used is unmodified, and really just a snapshot of our official ArchLinux mirror on this date. A simple reinstall of the mentioned packages also doesn't change behaviour.
I'm not an Arch/pacman expert, so please get in touch on #archlinux-bugs or by email (firstname dot lastname @ovh.net) if you want shell access to run some tests of your own. I can easily re-install the servers in question, there's nothing to lose :)
> The application shall not modify the structure to which the return value points, nor any storage areas pointed to by pointers within the structure.
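To illustrate the requirement quoted above: the explicit_bzero() call in sshd.c writes into storage owned by the structure that the password lookup returned, and the Valgrind output shows that storage was not writable on the affected machines. A minimal sketch of one way to stay within the rule is to copy the field into caller-owned memory and wipe only the copy. The helper names copy_secret and wipe_secret are hypothetical (they are not OpenSSH functions), and the volatile loop is only a portable stand-in for explicit_bzero(); this is not the fix that was applied upstream.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: duplicate a string (e.g. pw->pw_passwd) into
 * heap memory that the caller owns and is therefore allowed to modify.
 * Returns NULL if the allocation fails.
 */
char *copy_secret(const char *src)
{
    assert(src != NULL);

    size_t len = strlen(src);
    char *dst = malloc(len + 1);
    if (dst == NULL)
        return NULL;

    memcpy(dst, src, len + 1);   /* includes the terminating NUL */
    return dst;
}

/*
 * Hypothetical helper: overwrite len bytes of a caller-owned buffer
 * with zeros. The volatile pointer discourages the compiler from
 * optimising the stores away; it stands in for explicit_bzero().
 */
void wipe_secret(char *s, size_t len)
{
    volatile unsigned char *p = (volatile unsigned char *)s;

    while (len--)
        *p++ = 0;
}
```

With helpers like these, a caller would pass privsep_pw->pw_passwd to copy_secret(), work with the copy, then call wipe_secret() on the copy and free() it, leaving the structure returned by the lookup untouched.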