FS#25442 - grub-install fails with: Error 6: Mismatched or corrupt version of stage1/stage2

Attached to Project: Arch Linux
Opened by kachelaqa (kachelaqa) - Monday, 08 August 2011, 00:13 GMT
Last edited by Ronald van Haren (pressh) - Wednesday, 10 August 2011, 06:49 GMT
Task Type: Bug Report
Category: Packages: Core
Status: Closed
Assigned To: Ronald van Haren (pressh)
Architecture: i686
Severity: High
Priority: Normal
Reported Version:
Due in Version: Undecided
Due Date: Undecided
Percent Complete: 100%
Votes: 10
Private: No

Details

Description:

after updating to the linux 3 kernel, grub-install fails with the following output:

grub> root (hd0,0)
Filesystem type is ext2fs, partition type 0x83
grub> setup --stage2=/boot/grub/stage2 --prefix=/grub (hd0)
Checking if "/grub/stage1" exists... yes
Checking if "/grub/stage2" exists... yes
Checking if "/grub/e2fs_stage1_5" exists... yes
Running "embed /grub/e2fs_stage1_5 (hd0)"... 20 sectors are embedded.
succeeded
Running "install --stage2=/boot/grub/stage2 /grub/stage1 (hd0) (hd0)1+20 p (hd0,0)/grub/stage2 /grub/menu.lst"... failed
Error 6: Mismatched or corrupt version of stage1/stage2
grub> quit

Additional info:
* package version(s)
linux-3.0.1-1
grub-0.97-19

* other
/boot is on a separate ext2 partition
(all other partitions are ext3)

Closed by Ronald van Haren (pressh)
Wednesday, 10 August 2011, 06:49 GMT
Reason for closing: Fixed
Additional comments about closing: grub 0.97-20. If it still doesn't work for you, check the comments in this bug report for guidelines.
Comment by Ronald van Haren (pressh) - Monday, 08 August 2011, 07:31 GMT
installing grub has nothing to do with the kernel...

Maybe I'm reading this wrong, but

Running "install --stage2=/boot/grub/stage2 /grub/stage1 (hd0) (hd0)1+20 p (hd0,0)/grub/stage2 /grub/menu.lst"... failed

it is looking for the stage1 file in a different location than the stage2 file?
Comment by kachelaqa (kachelaqa) - Monday, 08 August 2011, 10:20 GMT
the reason i mentioned the kernel *update* was that i assumed it had entailed some changes to grub (because of the new naming scheme).

but whatever - if i downgrade grub to 0.97-17, the problem goes away.

(nb: duplicate of this bug:  FS#25449 )

Comment by Ronald van Haren (pressh) - Monday, 08 August 2011, 10:24 GMT
To further nail this down, does the problem also exist with grub 0.97-18?
Comment by Ionut Biru (wonder) - Monday, 08 August 2011, 10:26 GMT
-19 didn't have any modifications to the install-grub script, only a change for the new naming scheme in menu.lst. That's all.
Comment by kachelaqa (kachelaqa) - Monday, 08 August 2011, 10:40 GMT
@ronald
sorry, ARM doesn't have grub 0.97-18. do you still need me to test it?
Comment by Jonathan Liu (net147) - Monday, 08 August 2011, 10:51 GMT
grub 0.97-18 doesn't work and grub 0.97-17 doesn't build anymore with latest compiler.
Comment by Ronald van Haren (pressh) - Monday, 08 August 2011, 11:06 GMT
@Ionut: I know, just to be sure it is not a faulty build.

@all: Are you all running a separate /boot partition or should I also be able to reproduce this with /boot on the root partition?
Comment by Jonathan Liu (net147) - Monday, 08 August 2011, 11:08 GMT
/boot is on separate partition for me. I've rebuilt grub 0.97-18 and grub 0.97-19 myself to make sure it isn't just a bad binary package and both have the issue.
Comment by Ronald van Haren (pressh) - Monday, 08 August 2011, 11:17 GMT
The only significant change wrt -17 is the objcopy patch IIRC. It will be another couple of hours before I have time to look into this.
Comment by kachelaqa (kachelaqa) - Monday, 08 August 2011, 11:18 GMT
just to clarify: the main reason i mentioned the separate boot partition is because i read in a few places that having a non-ext3 boot fs can sometimes cause problems with grub-install.
Comment by Ronald van Haren (pressh) - Monday, 08 August 2011, 11:22 GMT
Yes I came across that as well (and a fix for that on the ubuntu bug tracker), but that was only for non-ext3 / partitions.

Also I did test -18 last week before I pushed it and it installed just fine (but I don't have a separate /boot partition), so that's why I asked.
Comment by Jonathan Liu (net147) - Monday, 08 August 2011, 11:25 GMT
@Ronald
Did you copy /usr/lib/grub/i386-pc/* to /boot/grub/ when you were testing?
Comment by Ronald van Haren (pressh) - Monday, 08 August 2011, 11:30 GMT
@Jonathan: I was testing the modifications on the install-grub script which does copy those files.
Comment by Jonathan Liu (net147) - Monday, 08 August 2011, 11:32 GMT
On another note, grub 0.97-19 works fine on x86_64
Comment by Ronald van Haren (pressh) - Monday, 08 August 2011, 11:40 GMT
I see, that's what I actually tested it on, as I don't have a working i686 installation.
Comment by Jonathan Liu (net147) - Monday, 08 August 2011, 11:43 GMT
If I use install-grub script it freezes.
Comment by Ronald van Haren (pressh) - Monday, 08 August 2011, 11:47 GMT
Uninstall xfsprogs if you don't have an xfs partition (there should probably be a check for that in the script, but it was there already so I didn't change it. Although the xfs_freeze utility could check before it freezes the whole system...).
Comment by Jonathan Liu (net147) - Monday, 08 August 2011, 11:49 GMT
Thanks, that allows the script to complete. It also gives the same error message as initially reported.
Comment by Jonathan Liu (net147) - Monday, 08 August 2011, 14:17 GMT
I think I found the problem. Compiling grub gives several "dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]" warnings. It is generating incorrect code as strict-aliasing optimisation is enabled for -Os which the code is being compiled with. To fix the issue, add -fno-strict-aliasing to CFLAGS.
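A minimal sketch of the suggested fix, assuming the build reads CFLAGS from the environment. The helper name add_cflag is hypothetical and is not taken from the actual grub PKGBUILD; it only shows one way to append -fno-strict-aliasing to an existing -Os without duplicating it.

```shell
#!/bin/sh
# Hypothetical helper (not from the real PKGBUILD): append a flag to CFLAGS
# only if it is not already present.
add_cflag() {
    flag=$1
    case " ${CFLAGS:-} " in
        *" $flag "*) ;;                          # already there, nothing to do
        *) CFLAGS="${CFLAGS:+$CFLAGS }$flag" ;;  # append, adding a space if needed
    esac
}

# Assumed starting point: the package is compiled with -Os, as stated above.
CFLAGS="-Os"
add_cflag -fno-strict-aliasing
export CFLAGS
echo "CFLAGS=$CFLAGS"
# A subsequent configure/make step would then inherit the exported CFLAGS.
```

Calling add_cflag a second time with the same flag leaves CFLAGS unchanged, so the helper can be run safely from more than one place in a build script.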
Comment by Ronald van Haren (pressh) - Monday, 08 August 2011, 14:23 GMT
Great, thanks!

I'll release an update with the fix when I'm home from work.
Comment by Jonathan Liu (net147) - Monday, 08 August 2011, 14:59 GMT
Might be a good idea to check whether the filesystem is xfs in install-grub while we're at it.
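One way such a check could look, as a sketch only: it reads the filesystem type from a /proc/mounts-style table and reports whether a mount point is xfs. The function names fstype_of and is_xfs are hypothetical, and this is not the check that was actually added to install-grub.

```shell
#!/bin/sh
# Print the filesystem type recorded for a mount point in a /proc/mounts-style
# table. The last matching entry wins, since later mounts shadow earlier ones.
# The table path is a parameter so the function can be tried on a sample file.
fstype_of() {
    mountpoint=$1
    table=${2:-/proc/mounts}
    awk -v mp="$mountpoint" '$2 == mp { type = $3 } END { if (type != "") print type }' "$table"
}

# Succeed only when the given mount point is listed with type xfs.
is_xfs() {
    [ "$(fstype_of "$1" "${2:-/proc/mounts}")" = "xfs" ]
}

if is_xfs /boot; then
    echo "/boot is xfs: freeze/unfreeze steps would apply"
else
    echo "/boot is not an xfs mount: skipping xfs_freeze"
fi
```

When /boot is not a separate mount point, fstype_of prints nothing and the check falls through to the skip branch; a fuller version would also have to look up the filesystem that contains /boot.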
Comment by Anonymous Submitter - Monday, 08 August 2011, 15:49 GMT
@ronald: Just FYI, these are the CFLAGS used in Fedora's grub-legacy spec file: "-Os -g -fno-strict-aliasing -Wall -Wno-error -Wno-shadow -Wno-unused -Wno-pointer-sign -m32". That might help, but even with these flags Fedora's grub package seems to break when built with gcc 4.6.
Comment by Ronald van Haren (pressh) - Monday, 08 August 2011, 17:38 GMT
@Jonathan: added the xfs stuff

Can someone please test (most notably the i686 package) before I push it to testing:

http://dev.archlinux.org/~ronald/packages/grub-0.97-20-i686.pkg.tar.xz
http://dev.archlinux.org/~ronald/packages/grub-0.97-20-x86_64.pkg.tar.xz
Comment by kachelaqa (kachelaqa) - Monday, 08 August 2011, 17:55 GMT
the i686 package is working for me. thanks
Comment by James Sharp (thermosuk) - Monday, 08 August 2011, 22:34 GMT
Not an expert with this sort of thing, but I downloaded and installed the testing version of grub (0.97-20) during an installation and grub still gives the same error.

This is with a netinstall, /boot is ext4.
Comment by Jonathan Liu (net147) - Monday, 08 August 2011, 22:47 GMT
@Ronald
i686 and x86_64 packages provided in your links and in [testing] are working for me.
Comment by Ashley Heron (breakage) - Monday, 08 August 2011, 23:07 GMT
@Ronald i686 package also working for me, on my main PC.

Is there any way to push this to a fresh i686 netinstall I'm attempting via virtualbox?
Or will I have to wait to install until it's out of testing and in core?
Comment by pedram (multiphrenic) - Tuesday, 09 August 2011, 00:46 GMT
I am reinstalling Arch (after having upgraded to 3.0) from a USB. Installed Grub 0.97-20-i686 before running the final Install Bootloader step but got the same message.
Comment by taylorchu (taylorchu) - Tuesday, 09 August 2011, 02:27 GMT
the bug still exists in Grub 0.97-20-i686
Comment by Ronald van Haren (pressh) - Tuesday, 09 August 2011, 07:01 GMT
@Ashley: Long time since I did a net-install, but I suppose you could enable testing there as well or create a custom repo?
Comment by Ronald van Haren (pressh) - Tuesday, 09 August 2011, 07:08 GMT
So what is the common denominator for those that still experience this? Do you all have a non ext2/3 /boot partition, something else?
Comment by taylorchu (taylorchu) - Tuesday, 09 August 2011, 07:56 GMT
i have ext4 for boot.
Comment by Jonathan Liu (net147) - Tuesday, 09 August 2011, 07:59 GMT
What timestamp do you get for the stage files if you do: ls -l /boot/grub?
Comment by James Sharp (thermosuk) - Tuesday, 09 August 2011, 08:00 GMT
I used ext4 as well.
Comment by taylorchu (taylorchu) - Tuesday, 09 August 2011, 08:02 GMT
2011-08-08 11:08 for stages.
2011-08-08 12:29 for menu.lst
Comment by Jonathan Liu (net147) - Tuesday, 09 August 2011, 08:19 GMT
Are the timestamps for "ls -l /boot/grub" the same as "ls -l /usr/lib/grub/i386-pc"?
Comment by taylorchu (taylorchu) - Tuesday, 09 August 2011, 08:30 GMT
no.
ls -l /boot/grub : 2011-08-08 11:08
ls -l /usr/lib/grub/i386-pc : 2011-08-08 12:29
Comment by Jonathan Liu (net147) - Tuesday, 09 August 2011, 08:36 GMT
Use the following command after upgrading to grub 0.97-20 if you install grub manually from grub shell or use grub-install:
cp -a /usr/lib/grub/i386-pc/* /boot/grub/
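The timestamp comparison above can also be done by content. The following is a sketch under the assumption that the packaged stage files live in /usr/lib/grub/i386-pc and the installed copies in /boot/grub, as in this thread; the function name stale_stage_files is hypothetical.

```shell
#!/bin/sh
# List files from the package directory whose copy in the boot directory is
# missing or differs byte-for-byte. Both directories are parameters so the
# check can be tried on scratch directories first.
stale_stage_files() {
    pkgdir=$1
    bootdir=$2
    for f in "$pkgdir"/*; do
        [ -f "$f" ] || continue          # skips the literal pattern if nothing matches
        name=${f##*/}
        if ! cmp -s "$f" "$bootdir/$name" 2>/dev/null; then
            echo "$name"
        fi
    done
}

# Paths taken from the thread; only run the check where they exist.
if [ -d /usr/lib/grub/i386-pc ] && [ -d /boot/grub ]; then
    stale_stage_files /usr/lib/grub/i386-pc /boot/grub
fi
```

Any file name printed here is a candidate for the cp -a step above; empty output means the copies in /boot/grub already match the installed package.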
Comment by taylorchu (taylorchu) - Tuesday, 09 August 2011, 08:45 GMT
ok. it works for me.
after one more confirmation, try to move this to core asap. someone might need this.

basically, I just uninstall and reinstall grub.
Comment by Ashley Heron (breakage) - Tuesday, 09 August 2011, 19:25 GMT
@Ronald Already tried enabling testing, but that was after core had installed and before I'd configured the bootloader, and I got filename errors. Not sure if that's why it messed up.
The filesystem on the vbox is set up w/ auto configure. / & /home are ext4 but i'm sure it autoconfigures /boot as ext2? The vbox install can wait until it's all ironed out so no worries, as my main install is fine.
Comment by Vladyslav Chyzhevskyi (coirius) - Tuesday, 09 August 2011, 20:27 GMT
When will the new grub build be moved to the core repository? Otherwise, can you tell me how I can install Arch with the new 3.0 kernel and the old grub 0.97-19?
Comment by Ronald van Haren (pressh) - Wednesday, 10 August 2011, 06:48 GMT
@Vladyslav: moved to core
