FS#54240 - [mkinitcpio-busybox][glibc] Segfault with glibc-2.25-2 at boot time

Attached to Project: Arch Linux
Opened by Natrio (natrio) - Wednesday, 31 May 2017, 12:57 GMT
Last edited by Bartłomiej Piotrowski (Barthalion) - Sunday, 18 June 2017, 22:02 GMT
Task Type Bug Report
Category Packages: Core
Status Closed
Assigned To Bartłomiej Piotrowski (Barthalion)
Architecture i686
Severity High
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 5
Private No

Details

Description:
After updating glibc from 2.25-1 to 2.25-2 and rebuilding the initramfs, the system can no longer boot because the busybox init process crashes.
Rolling back to glibc-2.25-1 and rebuilding the initramfs fixes the problem.

Steps to reproduce:
Install glibc-2.25-2 and run
/usr/lib/initcpio/busybox ash
(this is just one example; mkinitcpio-busybox with glibc-2.25-2 also segfaults on some other commands, not only init)
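
The reproduction step above can be wrapped in a small check. This is only a sketch: `run_and_classify` is a hypothetical helper name, not part of mkinitcpio or busybox, and it relies on the usual shell convention that a child killed by a signal is reported with exit status 128 + signal number (139 for SIGSEGV, signal 11 on Linux).

```shell
# Hypothetical helper (not shipped by any package): run a command with
# stdin/stdout/stderr detached and report how it ended.
run_and_classify() {
    status=0
    "$@" >/dev/null 2>&1 </dev/null || status=$?
    if [ "$status" -eq 139 ]; then
        # 139 = 128 + 11: the child was killed by SIGSEGV
        echo "segfault"
    elif [ "$status" -eq 0 ]; then
        echo "ok"
    else
        echo "failed ($status)"
    fi
}
```

For example, `run_and_classify /usr/lib/initcpio/busybox ash` starts the shell with an empty stdin, so a healthy binary should exit immediately and print `ok`, while a binary that crashes on startup would be reported as `segfault`.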

Closed by  Bartłomiej Piotrowski (Barthalion)
Sunday, 18 June 2017, 22:02 GMT
Reason for closing:  Fixed
Comment by Natrio (natrio) - Wednesday, 31 May 2017, 13:32 GMT
UPD:
A simple rebuild of mkinitcpio-busybox 1.25.1 also segfaults.
The latest version, 1.26.2, seems to be fine (in the ash test).
Comment by Natrio (natrio) - Wednesday, 31 May 2017, 13:46 GMT
UPD2:
The latest version, 1.26.2 (without any patches), also segfaults at boot time.
Comment by Dave Reisner (falconindy) - Thursday, 01 June 2017, 11:12 GMT
What commands crash? How can I reproduce this?
Comment by Natrio (natrio) - Thursday, 01 June 2017, 12:06 GMT
> What commands crash? How can I reproduce this?
/usr/lib/initcpio/busybox ash
(on glibc-2.25-2 i686)
I mean that busybox segfaults while trying to START the interactive shell, before any command is run INSIDE it.

But the primary manifestation of this bug is exactly the segfault of the busybox init process at boot time.
Tested on an Intel Celeron G530 and an Intel Pentium G2010, both running i686 Arch.
Comment by Dave Reisner (falconindy) - Thursday, 01 June 2017, 12:14 GMT
Your best bet is to provide a stack trace to upstream busybox. I don't have the time or resources for i686-specific bugs (https://www.archlinux.org/news/phasing-out-i686-support/).
Comment by Natrio (natrio) - Friday, 02 June 2017, 10:46 GMT
Arch [core] mkinitcpio-busybox-1.25.1 :
--------------------------------------
(gdb) run ash
Starting program: /tmp/bysybox/busybox ash

Program received signal SIGSEGV, Segmentation fault.
0xb7ee611d in __strcspn_sse42 () from /usr/lib/libc.so.6
(gdb) backtrace
#0 0xb7ee611d in __strcspn_sse42 () from /usr/lib/libc.so.6
#1 0x0805e760 in ?? ()
--------------------------------------
(gdb) run ipaddr
Starting program: /tmp/bysybox/busybox ipaddr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 1500 qdisc noqueue qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00

Program received signal SIGSEGV, Segmentation fault.
0xb7ee63c3 in __strspn_sse42 () from /usr/lib/libc.so.6
(gdb) backtrace
#0 0xb7ee63c3 in __strspn_sse42 () from /usr/lib/libc.so.6
#1 0x0807b2ee in ?? ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)


Upstream busybox 1.26.2 :
--------------------------------------
(gdb) run ipaddr
Starting program: /tmp/bysybox/1.26.2/busybox ipaddr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 1500 qdisc noqueue qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00

Program received signal SIGSEGV, Segmentation fault.
0xb7ee63c3 in __strspn_sse42 () from /usr/lib/libc.so.6
(gdb) backtrace
#0 0xb7ee63c3 in __strspn_sse42 () from /usr/lib/libc.so.6
#1 0x0807aebc in ?? ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
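
Both crashes land in functions whose names end in `_sse42`, which suggests (an assumption, not something confirmed in this report) that only CPUs advertising SSE4.2 would take this code path. A minimal sketch for checking that on a given machine follows; `has_cpu_flag` is a hypothetical helper that reads the x86 `flags` line of `/proc/cpuinfo`, or of another file in the same format passed as the second argument.

```shell
# Hypothetical helper: succeed (exit 0) if the first "flags" line of a
# cpuinfo-style file lists the given flag as an exact, whole word.
has_cpu_flag() {
    flag=$1
    cpuinfo=${2:-/proc/cpuinfo}
    awk -v want="$flag" '
        /^flags[ \t]*:/ {
            # Fields are whitespace-separated: "flags", ":", then one flag each
            for (i = 1; i <= NF; i++) if ($i == want) found = 1
            exit
        }
        END { exit !found }
    ' "$cpuinfo"
}
```

Usage: `has_cpu_flag sse4_2 && echo "SSE4.2 present"`.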
Comment by Natrio (natrio) - Friday, 02 June 2017, 10:56 GMT
The bug is in the glibc-2.25-2-i686 package (rebuilt with GCC 7, without patches); it causes segfaults not only in busybox but also in the VirtualBox GUI.

I made a static build (CONFIG_STATIC=y) of upstream busybox 1.26.2. It is much bigger (1.3M vs. 325K binary), but it works fine in the initramfs, without any errors, regardless of whether GCC 6 or GCC 7 was used to build it.
Comment by Benjamin Robin (benjarobin) - Monday, 05 June 2017, 22:54 GMT
I put some information here: https://bugs.archlinux.org/task/54316. Sorry for the duplicate...

Maybe the bug is in glibc, or in gcc... The glibc-2.25-1 package was built with gcc 6.
I tried rebuilding glibc-2.25-1 with gcc 7 and I get the same crash in ../sysdeps/x86_64/multiarch/strcspn-c.c:96
Comment by Benjamin Robin (benjarobin) - Monday, 05 June 2017, 23:18 GMT
Update: I fully reverted my system to its state as of 2017/03/12.
Then I ran the following tests (the busybox version is not relevant):
* busybox + Custom glibc 2.25-1 built with gcc 7.1.1-2 => Crash
* busybox + Custom glibc 2.25-2 built with gcc 7.1.1-2 => Crash
* busybox + glibc 2.25-2 (from arch repository, which was built with gcc 7) => Crash
* busybox + Custom glibc 2.25-2 built with gcc 6.3.1 => Ok
* busybox + glibc 2.25-1 (from arch repository, which was built with gcc 6) => Ok
Comment by Natrio (natrio) - Tuesday, 06 June 2017, 07:03 GMT
Just in case:
static busybox (built with gcc 7.1.1-2 and glibc 2.25) => OK

To get a static busybox build, find the "CONFIG_STATIC" line in the "config" file and change it to
CONFIG_STATIC=y
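
The edit described above could be scripted roughly as follows. This is a sketch, assuming the "config" file uses the usual Kconfig-style syntax (`NAME=value`, or `# NAME is not set` for disabled options); `enable_static` is a hypothetical helper name, not something provided by busybox or the Arch package.

```shell
# Hypothetical helper: force CONFIG_STATIC=y in a Kconfig-style config file.
# Rewrites either "CONFIG_STATIC=<anything>" or "# CONFIG_STATIC is not set",
# and appends the line if the option is missing entirely.
enable_static() {
    config=$1
    tmp="$config.tmp.$$"
    sed -e 's/^CONFIG_STATIC=.*$/CONFIG_STATIC=y/' \
        -e 's/^# CONFIG_STATIC is not set$/CONFIG_STATIC=y/' \
        "$config" > "$tmp" || return 1
    grep -qx 'CONFIG_STATIC=y' "$tmp" || echo 'CONFIG_STATIC=y' >> "$tmp"
    mv "$tmp" "$config"
}
```

Usage: run `enable_static config` in the build directory before building busybox. The patterns are anchored so that other options whose names merely start with `CONFIG_STATIC` are left untouched.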
Comment by Bartłomiej Piotrowski (Barthalion) - Wednesday, 07 June 2017, 09:20 GMT
Given the upcoming i686 deprecation, I'd rather make it statically linked for that architecture. I will put it in [testing] soon; let me know if it works.
Comment by Natrio (natrio) - Wednesday, 07 June 2017, 14:26 GMT
mkinitcpio-busybox 1.25.1-2 i686 in [testing] works, even with glibc-2.25-2-i686.

But the glibc-2.25-2-i686 package causes segfaults not only in busybox (the VirtualBox GUI, for example, is also affected; mkinitcpio-busybox is just the worst case), so I will keep glibc rolled back to 2.25-1 on my i686 system.
Comment by Benjamin Robin (benjarobin) - Wednesday, 07 June 2017, 17:46 GMT
@Barthalion @natrio The static build of mkinitcpio-busybox (in testing) only works because you built mkinitcpio-busybox using the static lib from glibc-2.25 that was built with GCC 6. I checked the assembly code, since I did not understand why the static build was working. A static build should produce the same result under the same conditions => we do not have the same conditions here...

So @Barthalion, you fixed a particular case, not the whole problem.

We really should file a bug report upstream, but I have no idea where the problem is (glibc or gcc).
Comment by Natrio (natrio) - Wednesday, 07 June 2017, 18:13 GMT
I already wrote that I used busybox-1.26.2, glibc-2.25-2 and GCC 7 for the static build, and it worked without segfaults.

The problem is only in the dynamic libc.so.6 binary from glibc-2.25-2-i686.

If time permits, I'll try to compile glibc with GCC-7 on i686 with different options and check what the segfault in __strspn_sse42 () depends on.
Comment by Benjamin Robin (benjarobin) - Wednesday, 07 June 2017, 21:12 GMT
Well, I am curious to see the assembly of __strspn_sse42 in your static build of busybox. Could you send the binary somehow? I cannot reproduce it.
Comment by Benjamin Robin (benjarobin) - Thursday, 08 June 2017, 18:42 GMT
Hum, I made many mistakes: my last 2 messages (the 3rd and 4th) are just wrong (the second message is still true). I am sorry... I should sleep more...
Comment by steadfasterX (steadfasterX) - Friday, 16 June 2017, 07:18 GMT
I can confirm this. I cannot build any i686 version at the moment because of it. Downgrading is not an option for me, and I hope that this gets fixed soon... If I can do anything to help, let me know.

Just saw the comment https://bugs.archlinux.org/task/54240#comment158296 - I will test and report back asap!

@Barthalion: I've upgraded glibc to testing -> 2.25-3 with no change. I still get a kernel panic.. Do I need to upgrade any other package from testing?
Comment by Natrio (natrio) - Friday, 16 June 2017, 18:24 GMT
steadfasterX, if you have i686 Arch, you need to upgrade mkinitcpio-busybox and run
mkinitcpio -P
before rebooting. This build does not depend on glibc and works well.
Comment by steadfasterX (steadfasterX) - Friday, 16 June 2017, 18:56 GMT
> you need to upgrade mkinitcpio-busybox

from testing?
Comment by Bartłomiej Piotrowski (Barthalion) - Friday, 16 June 2017, 20:14 GMT
No, I moved it to [core] over 12 hours ago.
Comment by Luke (slacka) - Sunday, 18 June 2017, 05:33 GMT
Updating to glibc 2.25-3 resolved this issue for me. Thanks for the quick response in getting this fixed!
Comment by steadfasterX (steadfasterX) - Sunday, 18 June 2017, 07:37 GMT
Having both packages upgraded now seems to solve the problem for me. ty @Barthalion