FS#19248 - GCC produces corrupted binaries (Core i7, -march=native, x86_64)

Attached to Project: Arch Linux
Opened by Andrej Podzimek (andrej) - Monday, 26 April 2010, 04:18 GMT
Last edited by Allan McRae (Allan) - Saturday, 22 May 2010, 07:59 GMT
Task Type Bug Report
Category Packages: Core
Status Closed
Assigned To Allan McRae (Allan)
Architecture x86_64
Severity Critical
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 5
Private No

Details

Description:

I have always compiled my kernel with KCFLAGS="-march=native -O2 -pipe". Last time I did it, my kernel crashed with a lot of "invalid opcode" trap events. This happened after the upgrade to GCC 4.5.

Not only the kernel is affected. Other programs seem to have problems as well.

named[16311] trap invalid opcode ip:7fed777519ff sp:7fffd44622d0 error:0 in libisc.so.60.1.4[7fed7772d000+52000]
blkid[1266] trap invalid opcode ip:3417611b0e sp:7fff70fb96e0 error:0 in libblkid.so.1.1.0[3417600000+1c000]
hald-probe-volu[5174] trap invalid opcode ip:3417611b0e sp:7fffb0699c50 error:0 in libblkid.so.1.1.0[3417600000+1c000]
ld[8577] trap invalid opcode ip:426dd0 sp:7fff0a9754f8 error:0 in ld[400000+87000]
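
For anyone triaging lines like these, a small sketch of how the faulting offset inside each object could be computed from the log. It assumes the line layout shown above; the function name `fault_offset` is made up for illustration. The offset is simply the instruction pointer minus the mapping base, which can then be looked up in a disassembly of the object.

```python
import re

# Matches the kernel's "trap invalid opcode" lines in the layout quoted above.
TRAP_RE = re.compile(
    r"(?P<comm>\S+)\[(?P<pid>\d+)\] trap invalid opcode "
    r"ip:(?P<ip>[0-9a-f]+) sp:[0-9a-f]+ error:\d+ "
    r"in (?P<obj>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def fault_offset(line):
    """Return (object name, hex offset of the faulting ip from the mapping base).

    Hypothetical helper, not part of any tool mentioned in this report.
    """
    m = TRAP_RE.search(line)
    if m is None:
        raise ValueError("not a trap line: %r" % line)
    ip = int(m.group("ip"), 16)
    base = int(m.group("base"), 16)
    size = int(m.group("size"), 16)
    offset = ip - base
    # Sanity check: the ip should fall inside the mapping the kernel reported.
    if not 0 <= offset < size:
        raise ValueError("ip lies outside the reported mapping")
    return m.group("obj"), hex(offset)


line = ("named[16311] trap invalid opcode ip:7fed777519ff sp:7fffd44622d0 "
        "error:0 in libisc.so.60.1.4[7fed7772d000+52000]")
print(fault_offset(line))  # ('libisc.so.60.1.4', '0x249ff')
```

Whether that offset equals the file offset of the instruction depends on how the object is mapped, so treat it as a starting point for disassembly rather than an exact address.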

Additional info:
* package version(s)
4.5.0-1

Steps to reproduce:
Take a Core i7 (820QM, for instance), build your kernel with KCFLAGS="-march=native -O2 -pipe" and boot... (Well, you won't boot.)

This is obviously an upstream bug, either wrong CPU detection or something more intricate. Should I report it somewhere else? BTW, it would be nice to have a "gcc44" package until this gets fixed.

Pieces of advice like "man, why don't you use -march=core2?" sound great, ;-) but I just think -march=native *should* work.

Closed by  Allan McRae (Allan)
Saturday, 22 May 2010, 07:59 GMT
Reason for closing:  Fixed
Additional comments about closing:  gcc-4.5.0-2
Comment by Allan McRae (Allan) - Monday, 26 April 2010, 04:46 GMT
So, manually setting the -march to the correct value works?
Comment by Allan McRae (Allan) - Monday, 26 April 2010, 04:58 GMT
What does:
gcc -Q -march=native --help=target | grep march

output?
Comment by Andrej Podzimek (andrej) - Monday, 26 April 2010, 05:23 GMT
1) I have no idea what the "correct" value is for Core i7. :-)
2) -march= atom

Atom??? Does that mean GCC thinks I have an Atom?

[andrej@octopus pkg]$ cat /proc/cpuinfo | grep 'model name'
model name : Intel(R) Core(TM) i7 CPU Q 820 @ 1.73GHz
model name : Intel(R) Core(TM) i7 CPU Q 820 @ 1.73GHz
model name : Intel(R) Core(TM) i7 CPU Q 820 @ 1.73GHz
model name : Intel(R) Core(TM) i7 CPU Q 820 @ 1.73GHz
model name : Intel(R) Core(TM) i7 CPU Q 820 @ 1.73GHz
model name : Intel(R) Core(TM) i7 CPU Q 820 @ 1.73GHz
model name : Intel(R) Core(TM) i7 CPU Q 820 @ 1.73GHz
model name : Intel(R) Core(TM) i7 CPU Q 820 @ 1.73GHz
Comment by Andrej Podzimek (andrej) - Monday, 26 April 2010, 05:25 GMT
This is interesting: http://forums.gentoo.org/viewtopic-p-6248668.html

"I'm however a little puzzled why it uses march=atom for core i7 (lynnfield family) (mtune=core2 should be clear) - strange :?"
Comment by Allan McRae (Allan) - Monday, 26 April 2010, 05:41 GMT
Yeah - I had seen that. Someone using a newer snapshot seems to have that fixed. I will build a new 4.5 snapshot and post it somewhere for you to test.
Comment by Andrej Podzimek (andrej) - Monday, 26 April 2010, 05:44 GMT
BTW, does Atom have any instructions that Core i7 wouldn't implement? I'll try to build a kernel with "-march=core2 -msse4 -msahf -mcx16 -O2 -pipe" and see what happens. (Using the current GCC 4.5 from the repo.)
Comment by Allan McRae (Allan) - Monday, 26 April 2010, 08:54 GMT
Try packages from here: http://allanmcrae.com/packages/gcc-4.5-20100422/ to see if a newer 4.5 snapshot helps.
Comment by Andrej Podzimek (andrej) - Monday, 26 April 2010, 12:58 GMT
The prerelease still says it's an Atom. The output from 'gcc -Q -march=native --help=target' says -mtune=core2, but -march=atom. SSE 4.1 and 4.2 are all enabled, which is correct. I'm attaching the output. Anyway, even if Atom was really the best choice, binaries produced by gcc -march=native just fail.

Binaries compiled with '-march=core2 -msse4 -msahf -mcx16 -O2 -pipe' seem to work just fine.
Comment by Andrej Podzimek (andrej) - Monday, 26 April 2010, 13:00 GMT
Well, the attachment doesn't seem to appear, trying again...
   output (3.6 KiB)
Comment by Allan McRae (Allan) - Monday, 26 April 2010, 13:29 GMT
Can you post the output using -march=core2 as well so I can see the difference?
Comment by Andrej Podzimek (andrej) - Monday, 26 April 2010, 14:08 GMT
Here it is.
With just -march=core2 (and no other options), sse2 and ssse3 seem to be disabled. That's weird... AFAIK, Core2 implements all that stuff.
   output (3.6 KiB)
Comment by Andrej Podzimek (andrej) - Monday, 26 April 2010, 14:13 GMT
This output comes from '-march=core2 -msse4 -msahf -mcx16'. Seems much more suitable for Core i7. Binaries compiled with these options (including the kernel) run just fine.
   output (3.6 KiB)
Comment by Allan McRae (Allan) - Monday, 26 April 2010, 22:03 GMT
The only difference between the atom and the last core2 option is -mpopcnt. Core i7s should support this... can you add that and see if the binaries still work?
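
A rough sketch of how two saved outputs of 'gcc -Q ... --help=target' could be compared mechanically. It assumes the usual "[enabled]" / "[disabled]" layout of that output. The two sample texts are abbreviated stand-ins made up for illustration, not the real attachments, and the helper names are hypothetical.

```python
import re

# One option per line, e.g. "  -mpopcnt        [enabled]".
# Lines such as "-march=  atom" carry a value instead and are skipped.
OPTION_RE = re.compile(r"^\s*(-m[\w.-]+)\s+\[(enabled|disabled)\]\s*$")


def parse_target_flags(text):
    """Map each -m option found in the text to True (enabled) or False."""
    flags = {}
    for line in text.splitlines():
        m = OPTION_RE.match(line)
        if m:
            flags[m.group(1)] = (m.group(2) == "enabled")
    return flags


def flag_differences(text_a, text_b):
    """Return {option: (state in a, state in b)} for options that differ."""
    a, b = parse_target_flags(text_a), parse_target_flags(text_b)
    return {
        name: (a.get(name), b.get(name))
        for name in sorted(set(a) | set(b))
        if a.get(name) != b.get(name)
    }


# Abbreviated, made-up samples (NOT the attached outputs).
NATIVE_SAMPLE = """\
  -march=                     atom
  -mcx16                      [enabled]
  -mmovbe                     [disabled]
  -mpopcnt                    [enabled]
  -msse4.2                    [enabled]
"""
CORE2_SAMPLE = """\
  -march=                     core2
  -mcx16                      [enabled]
  -mmovbe                     [disabled]
  -mpopcnt                    [disabled]
  -msse4.2                    [enabled]
"""

print(flag_differences(NATIVE_SAMPLE, CORE2_SAMPLE))
```

An option missing from one side shows up with None for that side, so options that exist in only one GCC version are still reported.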
Comment by Thomas Bächler (brain0) - Thursday, 29 April 2010, 15:49 GMT
Changing the kernel's CFLAGS manually was never supported upstream. Kernels compiled with the default CFLAGS that the kernel sets work fine.
Comment by Tomas Mudrunka (harvie) - Sunday, 02 May 2010, 20:11 GMT
brain0: I think the kernel should build fine with -march=native... if not, there's something wrong with gcc or the kernel makefiles... but maybe you just need to set it using "make config"...
Comment by Allan McRae (Allan) - Sunday, 09 May 2010, 13:03 GMT
@Andrej: Is this just the kernel or other software as well?
Comment by Andrej Podzimek (andrej) - Sunday, 09 May 2010, 16:50 GMT
@Allan: It's other software as well. Four binaries causing 'invalid opcode' are mentioned in the original issue report (named, blkid, ld, hald-probe-volu). Recompiling their packages with '-march=core2 -msse4 -msahf -mcx16' fixed it. Presumably, there were more failing binaries on my system completely built from ABS.
Comment by Allan McRae (Allan) - Sunday, 09 May 2010, 22:38 GMT
Please follow up with upstream. There does not appear to be a relevant bug report opened yet.
Comment by Allan McRae (Allan) - Tuesday, 11 May 2010, 05:48 GMT
Found this: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44046 . Will be fixed when I do the next gcc rebuild.

Comment by Anish Bhatt (anish) - Saturday, 15 May 2010, 03:13 GMT
Is there a timeframe for the rebuild, Allan? And will it go to [testing] first?
Comment by Allan McRae (Allan) - Saturday, 15 May 2010, 03:19 GMT
I intended to do this with glibc-2.12, but although that has been tagged, it is more of an RC than a real release, so that has been delayed. I would also like this to be backported: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43987 . Maybe I will just do the backport myself later in the weekend and put a new snapshot in [testing].
Comment by Marcel Korpel (Marcel-) - Thursday, 20 May 2010, 12:35 GMT
“BTW, does Atom have any instructions that Core i7 wouldn't implement?”
Yes, it does: the MOVBE instruction, which is unique to Intel Atoms. But it appears to be disabled in the output of 'gcc -Q -march=native --help=target' (as shown in the first attachment).
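
To check what the CPU itself advertises, something like the following sketch could be run against the text of /proc/cpuinfo. It assumes the Linux flag names "movbe" and "popcnt". The sample text is made up for illustration and the helper names are hypothetical.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of flags from the first 'flags' line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def missing_flags(cpuinfo_text, wanted):
    """Return the wanted flags that the CPU does not advertise, sorted."""
    return sorted(set(wanted) - cpu_flags(cpuinfo_text))


# Made-up, shortened sample; a real flags line is much longer.
SAMPLE_CPUINFO = """\
model name : Intel(R) Core(TM) i7 CPU Q 820 @ 1.73GHz
flags : fpu sse sse2 ssse3 cx16 sse4_1 sse4_2 popcnt lahf_lm
"""

print(missing_flags(SAMPLE_CPUINFO, ["movbe", "popcnt"]))  # ['movbe']
```

On a real system the text would come from open("/proc/cpuinfo").read(). Any flag reported as missing is one the compiler must not emit instructions for on that machine.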
Comment by Allan McRae (Allan) - Friday, 21 May 2010, 06:03 GMT
gcc-4.5.0-2 in [testing]
