FS#19733 - Update to glibc 2.12-2 on VIA C3 Nehemia makes system unusable
Attached to Project:
Arch Linux
Opened by Manfred Miederer (LessWire) - Monday, 07 June 2010, 01:41 GMT
Last edited by Allan McRae (Allan) - Monday, 25 October 2010, 01:01 GMT
Details
CPU: VIA C3 Nehemiah and latest glibc
I have been using Arch Linux on this system for over a year WITHOUT any problems, until yesterday. What happened:
upgrading binutils 2.20.1-2 => 2.20.1-3
upgrading glibc 2.11.1-3 => 2.12-2
After pacman installed the new glibc, I got an "invalid opcode" and the pacman script aborted. From then on, no command that uses glibc can be run; I always get "invalid opcode". Rebooting the machine crashes the kernel immediately after mounting the "real" rootfs.
It looks like this new version of glibc was not compiled for CPU = "generic"?
After downgrading to the old version of glibc (2.11.1-3), everything works fine.
Steps to reproduce: Simply do this update on a machine with this CPU.
Closed by Allan McRae (Allan)
Monday, 25 October 2010, 01:01 GMT
Reason for closing: Fixed
Additional comments about closing: As much as it is going to be...
http://forum.openvz.org/index.php?t=msg&goto=3497& - "VIA C3 isn't pure i686 processor. AFAIK it's i386."
looks like your processor is i386 not i686.
More reads: http://blueneon.xidus.net/bn/2005/06/05/gentoo-on-the-via-c3/
This is a C3 "Nehemiah", which is like a 686 for sure.
I have been running it for several years now, at first with Debian, and for over a year with Arch Linux. I never had any issues; I did a lot of updates and everything worked fine! Remember that the kernel itself is the most critical thing, and yes, it runs perfectly (currently 2.6.34).
you are right, there are c3 processors which are not compatible, but this one is (nehemiah)!
"Update, to disambiguate things: The Nehemiah core and later fully support the CMOV instruction mentioned below! As far as I know, you can use i686 as the CHOST for those."
That's true, I agree. ;-)
where can i get 2.12-1 ? (didn't have it in the core repo).
If the same CFLAGS are always used and an "invalid opcode" is thrown nevertheless, I would suspect the compiler of generating bad code for this CPU type. I noticed the latest version of gcc came in at the same time as glibc.
I think compiling the new lib with an older gcc version could be the solution. I don't want to take up more of your time, so I could try that myself, but this VIA system does not have enough space to install all the necessary tools.
Thanks Allan, i downloaded it and i will look to try it this weekend. This 24/7 system is standalone without kbd/screen, BIOS doesn't like boot from usb and it will cost time if it fails and must be downgraded within the "busybox shell" again :(
After a recent update (20101512, pacman -Syu) i get `Illegal instruction' with every command.
This is on a system with the same cpu as OP.
To confirm, i did a clean core install (archlinux-2010.05-core-i686.iso). This install worked fine.
After updating linux-api-headers i installed the glibc-2.12-2.1 package attached to this bug.
I again got an `Illegal instruction' with every command.
Downgrading to glibc-2.11.1-3 (from the iso) fixed my system.
No need to test 2.12-2.1 for myself.
That will take about 10 glibc builds to track down. I do not have that particular hardware, so I will provide a package for you to test; you tell me whether it works, and then I will provide the next package... This will take a few weeks!
Of course, if either of you can do the bisect yourself, this will be much faster? If you can not, I will start uploading more test packages and waiting on your reports.
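For readers wondering where "about 10 builds" comes from: a bisect halves the range of suspect commits with every test. Below is a minimal Python sketch of that idea. The commit list, its length of 1000 and the culprit position are made up for illustration (the thread does not say how many commits lie between the two releases), and is_bad stands in for "build the package and check whether it crashes on the C3".

```python
def first_bad(commits, is_bad):
    """Return (index of first bad commit, number of tests run).

    Assumes commits[0] is known good, commits[-1] is known bad, and
    every commit after the first bad one is bad as well.
    """
    lo, hi = 0, len(commits) - 1  # lo is known good, hi is known bad
    builds = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        builds += 1  # one test package built and tried on the hardware
        if is_bad(commits[mid]):
            hi = mid
        else:
            lo = mid
    return hi, builds


# Hypothetical history: 1000 commits, breakage introduced at index 613.
commits = list(range(1000))
index, builds = first_bad(commits, lambda c: c >= 613)
print(index, builds)
```

With 1000 commits the loop needs at most 10 tests, which matches the estimate above. In practice "git bisect" does this bookkeeping for you.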
http://dev.archlinux.org/~allan/glibc-2.12-2.2-i686.pkg.tar.xz
(glibc-24c0bf7a)
But probably a simple solution would be if I could get a version of "rsync" statically linked against glibc <= 2.11.1. Or is there a way to set up another library path for rsync before the test?
2.12-2.2 runs !!!
I used "pacman -U ..." and i had to delete "/etc/ld.so.cache" first to get it installed.
Next:
http://dev.archlinux.org/~allan/glibc-2.12-2.3-i686.pkg.tar.xz
(glibc-c60bce2c)
http://bugs.archlinux.org/task/19806?project=1&string=glibc&search_name=&type%5B0%5D=&sev%5B0%5D=&pri%5B0%5D=&due%5B0%5D=&reported%5B0%5D=&cat%5B0%5D=&status%5B0%5D=open&percent%5B0%5D=&opened=&dev=&closed=&duedatefrom=&duedateto=&changedfrom=&changedto=&openedfrom=&openedto=&closedfrom=&closedto=
I guess the code is not too cpu specific and i think the compiler (gcc 4.5) stays under suspicion furthermore ;-)
Yes, 2.12-2.3 runs !
2.12-2.3 works (i did edit last comment. so maybe this was overlooked)
http://dev.archlinux.org/~allan/glibc-2.12-2.4-i686.pkg.tar.xz
(glibc-463ed2f0)
2.12-2.4 doesn't work; it's the same behaviour as with the original upgrade.
Arrrgh! I can set LD_LIBRARY_PATH, but rsync doesn't follow --> invalid opcode :(
Next:
http://dev.archlinux.org/~allan/glibc-2.12-2.5-i686.pkg.tar.xz
(glibc-2fe000df)
50/50? I have to look to make things easier so i can work in a ssh shell only. I think a chroot environment could be fine. what do you think?
Back to the topic:
Just for a try i restored only libc-2.11.90.so (from 2.12-2.3), but that doesn't help.
Restored all from 2.12-2.3 now and installed 2.12-2.5 and it doesn't work.
http://dev.archlinux.org/~allan/glibc-2.12-2.6-i686.pkg.tar.xz
(glibc-741895aa)
I think I see the relevant change... if I am correct, then this package should work.
... and you are correct - 2.12-2.6 works.
Can you say it with a few words: what's the cause ?
http://sourceware.org/git/?p=glibc.git;a=commit;h=01f1f5ee
However, that is one of 20 possible commits, so it is still not confirmed. I will now rebuild the current glibc with just that commit reverted to confirm this is the issue.
i have a chroot environment now, each test (and restore of old libs) is finished after a few minutes.
Just that one commit reversed... let's see how good I am at spotting the breakage!
So now I just have to figure how to properly fix this.
Reading the Fedora discussion, the cause is a "nopl" instruction, which never has been described in Intel specs. That's sloppy or they did it with intention ? :(
I compiled some kernels on this machine and i know that it must have "-march/mtune=native" or "generic" and not "i686". That's why i have asked for the "generic" switch so permanently.
It looks like there are a lot of Fedora users using a Geode cpu, so there should be a solution soon?
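To make the "nopl" discussion a bit more concrete: the multi-byte NOP forms in question all start with the opcode bytes 0F 1F. The Python sketch below lists three of these encodings and does a naive byte scan for them. This is only a rough heuristic written for illustration, not a disassembler: a raw byte match can also hit data or the middle of another instruction, so "objdump -d" is the better tool for a real check. The names NOPL_FORMS and find_nopl_candidates are made up for this example.

```python
# Multi-byte NOP encodings that begin with the 0F 1F opcode ("NOPL").
# Only the 3-, 4- and 5-byte forms are listed here for illustration.
NOPL_FORMS = {
    3: bytes([0x0F, 0x1F, 0x00]),              # nopl (%eax)
    4: bytes([0x0F, 0x1F, 0x40, 0x00]),        # nopl 0x0(%eax)
    5: bytes([0x0F, 0x1F, 0x44, 0x00, 0x00]),  # nopl 0x0(%eax,%eax,1)
}


def find_nopl_candidates(code):
    """Return (offset, length) pairs where a listed NOPL byte pattern occurs.

    Heuristic only: the bytes are not decoded as instructions, so false
    positives are possible.
    """
    hits = []
    for offset in range(len(code)):
        for length, pattern in NOPL_FORMS.items():
            if code.startswith(pattern, offset):
                hits.append((offset, length))
    return hits


# A one-byte NOP (0x90), a 4-byte NOPL, then RET (0xC3).
sample = b"\x90" + NOPL_FORMS[4] + b"\xc3"
print(find_nopl_candidates(sample))
```

A CPU without NOPL support raises an invalid-opcode exception on such a sequence, which is the symptom reported in this task.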
Your choices are:
1) build glibc with the causal commit reverted (see link above).
2) build a kernel that emulates NOPL instructions (see http://bbs.archlinux.org/viewtopic.php?pid=775414)
I recommend the custom kernel, as that is likely to be how this is fixed in the long term (although not with that patch exactly) and then you do not run the risk of other software crashing because that instruction gets included. I guess a patch to fix this will eventually work its way into the kernel mainline and thus Arch.
CPUs affected include Via C3, Via Eden, AMD Geode LX (as used in OLPC), Transmeta Crusoe and Virtual PC.
Affected package: binutils
Related Bugreports:
http://www.sourceware.org/bugzilla/show_bug.cgi?id=6957
https://bugzilla.redhat.com/show_bug.cgi?id=579838
The binutils beta contains a fix for this
http://gcc.gnu.org/ml/gcc/2010-08/msg00194.html
My CPU is AMD Phenom X4 9850 and running archlinux x86_64.
Your issue with glibc etc. has another reason not belonging to this thread!
By the way and as Allan wrote above, i always build my kernel for Via Nehemiah with NOPL emulation and everything works fine.
There is no solution at all. There must be a better solution than no solution
The "right" solution:
- update to binutils >= 2.20.51.0.11
- rebuild glibc and other affected packages.
but then the system boots as i586 and you can't install/update packages from repo anymore.
cat /proc/cpuinfo
processor : 0
vendor_id : CentaurHauls
cpu family : 6
model : 9
model name : VIA Nehemiah
stepping : 8
cpu MHz : 666.549
cache size : 64 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr cx8 sep mtrr pge cmov pat mmx fxsr sse up rng rng_en ace ace_en
bogomips : 1333.64
clflush size : 32
cache_alignment : 32
address sizes : 32 bits physical, 32 bits virtual
power management:
here are some outputs from core kernel and my own nopl emu kernel.
http://files.osuv.de/geode/outputs/
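For anyone who wants to compare their own CPU against the /proc/cpuinfo listing above, here is a small Python sketch that pulls out the flags line and checks for "cmov" and "nopl". The sample text is a shortened copy of the listing in this comment. That a kernel advertises NOPL support under the flag name "nopl" is my assumption about how newer kernels report it; a missing flag on an older kernel does not prove anything by itself.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of flags from the first 'flags' line of
    /proc/cpuinfo-style text, or an empty set if there is none."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


# Shortened copy of the listing posted in this comment.
SAMPLE = """\
vendor_id : CentaurHauls
model name : VIA Nehemiah
flags : fpu vme de pse tsc msr cx8 sep mtrr pge cmov pat mmx fxsr sse up rng rng_en ace ace_en
"""

flags = cpu_flags(SAMPLE)
for name in ("cmov", "nopl"):
    print(name, "present" if name in flags else "missing")
```

On a live system, pass open("/proc/cpuinfo").read() instead of SAMPLE. For the Nehemiah listing this reports cmov as present and nopl as missing.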
As far as I can tell, this is due to the kernel not believing your system is i686. Not much that can be done about that from a toolchain perspective... maybe a kernel update will change that or when the proper fix gets released in binutils (late November). I suggest you keep using a custom kernel until that point.
because it was just opened for the via nehemia.
geode users have to use nopl patch and wait for binutils major update or as you said, ignore system architecture.