FS#15051 - [code included] makepkg: update md5sums directly in PKGBUILD

Attached to Project: Pacman
Opened by Tomas Mudrunka (harvie) - Thursday, 11 June 2009, 03:30 GMT
Last edited by Allan McRae (Allan) - Wednesday, 16 June 2010, 07:01 GMT
Task Type Feature Request
Category makepkg
Status Closed
Assigned To No-one
Architecture All
Severity Medium
Priority Normal
Reported Version 3.2.2
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 4
Private No

Details

Summary and Info:
there is makepkg -g option that will produce array of checksums that can be used in PKGBUILD, but why not update PKGBUILD directly? maybe i can add "source md5sums" line to my PKGBUILD and ship it with file created by "makepkg -g > md5sums", but i think that there can be some more simple way (makepkg -G).

i think that this can be done using available "makepkg -g 2>/dev/null" feature and bit of sed+regexp magic.
i'll try to find some solution and post the code to the comments (if i'll manage to write that)

this is my simple implementation which works for me:

update_hashes() {
sed -ne '1h;1!H;${;g;s/md5sums=([^)]*)/'$(makepkg -p "$1" -g 2>/dev/null)'/g;p;}' "$1" > "$1.new"
#maybe this code should be changed to replace only first occurence of md5sums array...
mv -f "$1" "$1.bak"
mv -f "$1.new" "$1"
}
update_hashes PKGBUILD;


there can be also some easy way to update checksums when packing source packages like: "makepkg --source -g"

any other ideas?
This task depends upon

Closed by  Allan McRae (Allan)
Wednesday, 16 June 2010, 07:01 GMT
Reason for closing:  Won't implement
Additional comments about closing:  Best implemented outside of makepkg. Feel free to submit a script for pacman-contrib
Comment by Tomas Mudrunka (harvie) - Thursday, 11 June 2009, 03:38 GMT
oh i've just found that more hashing alghoritms are supported so sed should look for those: md5|sha1|sha256|sha384|sha512
and they should be replaced by preffered alghoritm (corresponding to makepkg settings - archlinux guidelines)
Comment by Tomas Mudrunka (harvie) - Thursday, 11 June 2009, 03:54 GMT
fixed version matches only first occurence and doesn't matter which alghoritm was there. it will be replaced by makepkg -g output:

update_hashes() {
#sed -ne '1h;1!H;${;g;s/md5sums=([^)]*)/'$(makepkg -p "$1" -g 2>/dev/null)'/g;p;}' "$1" > "$1.new"
sed -ne '1h;1!H;${;g;s/\(md5\|sha\|sha1\|sha224\|sha256\|sha384\|sha512\)sums=([^)]*)/'$(makepkg -p "$1" -g 2>/dev/null)'/1;p;}' "$1" > "$1.new"
mv -f "$1" "$1.bak"
mv -f "$1.new" "$1"
}
update_hashes PKGBUILD;
Comment by Allan McRae (Allan) - Thursday, 11 June 2009, 04:33 GMT
I actually do not mind this idea when using a different flag (i.e. not automatically doing this). "makepkg -G" sounds good.

Now as for the implementation. A couple of things I think could be improved. First, a PKGBUILD can specify multiple hashes which with the above would result in multiple copies of whatever is generated by "makepkg -g". So each hash should be updated separately. That also means not using the "makepkg -g" output but having to generate it.
Comment by Tomas Mudrunka (harvie) - Thursday, 11 June 2009, 15:04 GMT
then you'll have to modify this code to fit your standarts exactly the way you want... ;)
Comment by Pierre Schmitz (Pierre) - Thursday, 11 June 2009, 16:08 GMT
Not sure if that does help, but I am using the following for a long time (not made by myself rhough):

{ rm PKGBUILD; awk '$0 ~ /^md5sums/ {i = 1; system("makepkg -g 2>/dev/null")}; !i {print}; $0 ~ /\)/ {i = 0}' > PKGBUILD; } < PKGBUILD
Comment by Loui Chang (louipc) - Thursday, 06 August 2009, 15:57 GMT
Make a truly unique temp file with mktemp.
People might have a .bak or .new that they actually want.
Comment by Tomas Mudrunka (harvie) - Friday, 14 August 2009, 10:21 GMT
louipc: temporary file contents can be also stored in memory.
Comment by Gavin Bisesi (Daenyth) - Tuesday, 08 September 2009, 21:53 GMT
I like the -G suggestion.
Comment by Ray Rashif (schivmeister) - Friday, 09 October 2009, 14:20 GMT
Funny..I proposed this before but it didn't appear to be that useful for an extra makepkg option (and I've come to agree with that): http://mailman.archlinux.org/pipermail/pacman-dev/2008-March/005939.html
Comment by Xavier (shining) - Friday, 09 October 2009, 14:39 GMT
Ah nice to know, I completely forgot about this patch.
Though it does not address allan's comment either (each hash should be updated separately).

I think both makepkg -g and makepkg -G should look at the hashsums defined in the PKGBUILD and use that.
Comment by Ray Rashif (schivmeister) - Friday, 09 October 2009, 16:31 GMT
Yes that's a good point. But the idea in general also leaves a number of loopholes, especially when a PKGBUILD can have more than just 1 copy of said sums. Imagine someone does -g a number of times, then there would be a sum before build(), and numerous after. I don't know how to make Sed work for that.

Surprisingly I still have that old patch, and a quick look at it again reveals a good number of bad assumptions and flaws =/ But if someone can provide a _failsafe_ awk/sed/perl one-liner to restrict the focus between n sets of /^${integ}sums=('/,/')/ then it will work flawlessly. Once that's taken care of, whatever ${integ}s are used in the current buildscript can be generated separately and safely injected one by one.

If it takes more than that to get it working, we should forget about implementing this.
Comment by Gavin Bisesi (Daenyth) - Friday, 09 October 2009, 16:52 GMT
Some packages have md5sums that vary based on CARCH also, so we'd have to keep that in mind... This seems pretty ugly. Is it acceptable to create a script to do this for pacman-contrib, but that does not cover 100% of cases? Should we encourage using it, with certain caveats?
Comment by Ray Rashif (schivmeister) - Friday, 09 October 2009, 17:25 GMT
Correct. But if that's the case, and it'll be crippled, then why include it in a pkg at all? Might as well encourage in documentation, eg. wiki as pacman/makepkg tricks. However, someone just posted a perl script in the MLs, so might want to check that out.

# ~/.bashrc
# will not work on sums at bottom of pkgbuild each on a newline
mkpk() {
sed -i '/^md5sums=([^)]*)/d' PKGBUILD && makepkg -g >> PKGBUILD && makepkg
}

And you don't even need this if all you're doing is installing for yourself, and not distributing (eg. uploading to AUR). makepkg simply takes the last set it finds.
Comment by Tomas Mudrunka (harvie) - Saturday, 10 October 2009, 13:57 GMT
Daenyth: if you are specifiing hashes in some unsupported way, then you just will not use this feature. maybe there should be some lock in options=() array to prevent you of doing this accidentally. but you will be still able to generate your hashes using makepkg -g and replace them by hand. this -G feature will be used only by maintainers which will know what it does - it will not be called automaticaly. it fits at least 99% of packages and it's just optional feature to save time - not compulsory solution.
Comment by Ray Rashif (schivmeister) - Monday, 12 October 2009, 03:46 GMT
Nevermind, all I needed was a "$" =p

Anyway, this is still sort of crippled (no CARCH support), but integrates fine.

makepkg -G, --getinteg
Comment by Ray Rashif (schivmeister) - Monday, 12 October 2009, 18:09 GMT
Sorry, the last one was defective.
Comment by Xavier (shining) - Tuesday, 13 October 2009, 11:43 GMT
To Ray : what about moving this code to generate_checksums ?

+ # necessary override for --geninteg
+ unset INTEGRITY_CHECK
+
+ for integ in md5 sha1 sha256 sha384 sha512; do
+ grep "${integ}sums=(" $file &> /dev/null && \
+ INTEGRITY_CHECK=(${INTEGRITY_CHECK[@]} $integ)
+ done
+
+ [ ${#INTEGRITY_CHECK[@]} -eq 0 ] && source "$MAKEPKG_CONF"

Then it would be the behavior I proposed earlier : "I think both makepkg -g and makepkg -G should look at the hashsums defined in the PKGBUILD and use that."

Any opinions ?
Comment by Ray Rashif (schivmeister) - Tuesday, 13 October 2009, 20:13 GMT
Indeed. It will simply source makepkg.conf again if there are no sums in the script, so that's good.
Comment by Xavier (shining) - Tuesday, 13 October 2009, 22:50 GMT
Thanks for your patch, it motivated me to get coding.
There are a few things I did not like so I rewrote it a bit.
Like it does not seem necessary to re-source makepkg.conf , you just need to use a different variable name :)
Also I removed some features which seemed a bit too hackish to me, like the check for CARCH or the check for vim modeset line.

I hope I did not break things too much and that you like my version.
Comment by Xavier (shining) - Tuesday, 13 October 2009, 23:10 GMT
Yeah it does not work so well with more complex checksums lines like the following :
[ "$CARCH" = "i686" ] && md5sums=('0c9c81f8f0b85fd96a8db06d36b50b33')

The sed line actually deleted the next stuff after that line, which was the build() function, so that was killed :)

Thinking about it, it seems impossible to support every damn pkgbuilds out there. This feature might be much more dangerous than it is helpful.
If we still really want to do that, we should at the very least add a backup to sed -i line. But I am not sure what happens if we mess up twice in a row, we lose everything too ?

Anyway I like the new behavior of makepkg -g I implemented, but not sure I will support makepkg -G.
Comment by Ray Rashif (schivmeister) - Wednesday, 14 October 2009, 00:20 GMT
Sure, now it looks better =p The vim one was just for cosmetics in case you want to preserve the look. As far as the sed goes, this one appears to be failsafe (will not go beyond "$" which is EOL). It's just the final appearance for multi-arch sums which is troublesome.

If -G is implemented, it'll be like what Daenyth mentioned: "with certain caveats"

Hence the check for "CARCH && sums" so it fails for such cases. I'm also in support of "work for everything not most things" here, so I'd say -G doesn't prove to be worthy. It can be done, and that sed line will work fine, but when it comes to redirecting, eg. echo "[ \"$CARCH\" ... && $(generate_checksums)" >> $script it looks like:

[ "$CARCH" = "i686" ] && md5sums=('0c9c81f8f0b85fd96a8db06d36b50b33')

Sure you can fix it with more sed, but is it appropriate to go that far just for a simple result? Worse for those with more than one sum and line breaks, so I abandoned that idea entirely :)
Comment by Ray Rashif (schivmeister) - Wednesday, 14 October 2009, 00:24 GMT
Uhmm..there were supposed to be a few white spaces between && and md5sums ^
Comment by Xavier (shining) - Wednesday, 14 October 2009, 06:45 GMT
Try to put this line inside a pkgbuild :
[ "$CARCH" = "i686" ] && md5sums=('0c9c81f8f0b85fd96a8db06d36b50b33')
Then run your sed line :
sed -i".bak" "/md5sums=(/,/)$/d" PKGBUILD

here it deletes the next line, so that's anything but safe. If the next line is your build() function, that's gone.

Also I could fool your CARCH grep check by doing it on multiple lines :
[ "$CARCH" = "i686" ] \
&& md5sums=('0c9c81f8f0b85fd96a8db06d36b50b33')

That's why I prefer to not do the check and try to find out why sed does not work here.
Comment by Ray Rashif (schivmeister) - Wednesday, 14 October 2009, 12:15 GMT
Ahh ok, looks like that only happens when those are the only sums or there are more of it (I assumed PKGBUILDs wouldn't be done this way). It works on one with a normal line and an arch-specific line; it works on the nvidia pkg:

md5sums=('cf40656600b8a587e82a801f05fa2d95')
[ "$CARCH" = "x86_64" ] && md5sums=('c9827059697001fa61518e56fdc24e93')
build() {

But also fails on something like:

md5sums=('cf40656600b8a587e82a801f05fa2d95')
[ "$CARCH" = "x86_64" ] && md5sums=('c9827059697001fa61518e56fdc24e93')
[ "$CARCH" = "any" ] && md5sums=('su8270596970033456t965676fdc24e9')

The CARCH check is also an assumption (that nobody will write it any other way), so that's bad. I guess here there isn't any solution, since there a number of possibilities (and assumptions to be made).

I'd say -G would be disqualified, given the above :)
Comment by Xavier (shining) - Wednesday, 14 October 2009, 12:24 GMT
To be honest, I do not understand this sed line :
sed -i".bak" "/md5sums=(/,/)$/d" PKGBUILD

I don't know what /,/ is, how it works, and why it fails in the CARCH case.
Maybe someone could come up with a safer sed command (or another tool) that really only deletes the checksums and nothing else.

What we want to achieve is to strip following regexp (which can be spread on multiple lines...) : {md5,sha1,...}sums=([^)]*)

We cannot implement -G until we have a good and safe way to do this.

Edit : I just realized now that the original report included a different sed command. I will try it and report back :)
Comment by Ray Rashif (schivmeister) - Wednesday, 14 October 2009, 12:41 GMT
I tried the one here:

sed "/md5sums=([^)]*)/d"

But it had trouble with multiple lines or more copies of sums.

For the one in the patches:

sed "/PATTERN1/,/PATTERN2/d" - Delete only part containing first match PATTERN1 and continue (across more than one line if needed) until PATTERN2

I have no idea why it's not working in the above cases either.
Comment by Xavier (shining) - Wednesday, 14 October 2009, 19:16 GMT
I must be stupid but the one in the original report does not work for me :
$ sed -ne '1h;1!H;${;g;s/md5sums=([^)]*)/'$(makepkg -g 2>/dev/null)'/g;p;}' PKGBUILD
sed: -e expression #1, char 73: unterminated `s' command

The best one so far actually seems to be the awk one that Pierre shared :
{ rm PKGBUILD; awk '$0 ~ /^md5sums/ {i = 1; system("makepkg -g 2>/dev/null")}; !i {print}; $0 ~ /\)/ {i = 0}' > PKGBUILD; } < PKGBUILD

And it seems this is best run outside of makepkg.. So well, I am giving up with the idea to include this inside makepkg.
Comment by Tomas Mudrunka (harvie) - Wednesday, 14 October 2009, 23:20 GMT
I think that it can be at least implemented as unsupported feature to makepkg with warning that it will not work for scripts that have more complex hashes or it can be disabled for PKGBUILDs with more than one checksum array.
Comment by Ray Rashif (schivmeister) - Wednesday, 14 October 2009, 23:41 GMT
Thomas: The patch already does that (the first one gives you the warning you want; you can path it directly to your /usr/bin/makepkg if you really want to see it in action), and the problem isn't about number/type of sums, those are fine. It's the compatibility with all use cases and multi-architecture sums that matter, and there is no way to ensure proper behaviour. Since it can also be done by running your own (rather small) local script, we shouldn't even bother implementing this.

And..I've never heard of "implementing" an "unsupported" feature. IIRC that's called "tainting" =p
Comment by Gavin Bisesi (Daenyth) - Thursday, 15 October 2009, 02:40 GMT
Since there's no reliable way to do it, I think this should be done in a script other than makepkg if it's done at all.
Comment by Tomas Mudrunka (harvie) - Thursday, 15 October 2009, 11:29 GMT
Ray: when you have "multi-architecture sums that matter" you will probably have multiple checksum arrays in PKGBUILD... so we are able to detect it that way. or we can make some standart for multi-architecture checksuming, so we will be able to update the hashes properly.
Comment by Ray Rashif (schivmeister) - Thursday, 15 October 2009, 13:04 GMT
Yes, correct. But we don't have anything solid to safely match and ignore/replace those yet. I'm giving up on this too but anyone's free to brainstorm :)
Comment by Tomas Mudrunka (harvie) - Thursday, 15 October 2009, 19:25 GMT
what about adding something special into the PKGBUILD in order to enable this feature?

md5sums=('update' 'cf40656600b8a587e82a801f05fa2d95')
[ "$CARCH" = "x86_64" ] && md5sums=('update:x86_64' 'c9827059697001fa61518e56fdc24e93')
[ "$CARCH" = "any" ] && md5sums=('update:any' 'su8270596970033456t965676fdc24e9')
Comment by Tomas Mudrunka (harvie) - Thursday, 15 October 2009, 19:28 GMT
or maybe another possibility is that you'll need enable this using something like options=(updatesums) and there will be warning about multi-arch packages in documentation of this option...
Comment by Xavier (shining) - Thursday, 15 October 2009, 19:34 GMT

Loading...