FS#7485 - use bsdtar in makepkg
Attached to Project:
Pacman
Opened by Baptiste Daroussin (bapt) - Thursday, 21 June 2007, 09:25 GMT
Last edited by Dan McGee (toofishes) - Monday, 09 July 2007, 04:32 GMT
Details
Summary and Info:
pacman depends on libarchive, which provides bsdtar. bsdtar can handle nearly all the formats needed by makepkg and is always installed, since it comes as a dependency of pacman. I've made a patch to use bsdtar instead of unzip in makepkg. I tested it on several packages and it works great. The benefits are a single tool to rely on for uncompressing sources and fewer dependencies (gnutar and unzip would no longer be necessary); it also removes the unzip hack. Here are two patches: one that only removes unzip but keeps gnutar (makepkg-nounzip.patch), and one that uses only bsdtar (makepkg-bsdtar.patch). The latter also seems to be a little (very little :)) bit faster. The patches are made against the makepkg delivered with pacman-3.0.5-2.
Closed by Dan McGee (toofishes)
Monday, 09 July 2007, 04:32 GMT
Reason for closing: Implemented
Additional comments about closing: implemented in GIT
This is not an issue in recent git versions because .FILELIST is generated by ls now (after "allow it to create an empty package" patch).
Also, it seems this solves the complaint about adding unzip to makepkg's depends, because bsdtar handles this, as far as I can see.
It seems to work for me, but it needs testing, as I don't have the whole pacman git installed (I need to keep the stable one).
All decompression that can be done with bsdtar is done with the command bsdtar -x -f (no need for z, j, or anything else, as bsdtar automatically detects the archive type, i.e. tar.gz, tar.bz2, cpio, zip).
I removed the unzip hack.
And I use bsdtar instead of tar for creating the package; this way makepkg relies only on bsdtar as a (de)compression program.
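A rough sketch of what this single-tool approach might look like in shell. This is not the actual patch: the function names and the TAR_BIN override are hypothetical, and gzip is assumed as the package compression.

```shell
# Hypothetical helpers illustrating a bsdtar-only workflow; not taken from the patch.
# TAR_BIN is an assumed override so the sketch can be tried where bsdtar is absent.

# Extract an archive into a directory. No z or j flag is passed because
# the format and compression are detected from the file itself.
extract_archive() {
    archive=$1
    dest=$2
    mkdir -p "$dest" &&
        "${TAR_BIN:-bsdtar}" -x -f "$archive" -C "$dest"
}

# Create a gzip-compressed package from the contents of a directory.
create_archive() {
    pkgdir=$1
    archive=$2
    "${TAR_BIN:-bsdtar}" -c -z -f "$archive" -C "$pkgdir" .
}
```

Both functions go through the same binary, which is the point of the proposal: one program for unpacking sources and for building the package.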
For the record, bsdtar can also handle the following formats: iso9660 and ar files. I think iso9660 is useless for makepkg, but ar could be interesting for Debian packages, for programs that are only available to download as rpm or deb. For example, we currently have the openoffice langpacks, which use rpm and therefore depend on rpmextract.sh, but are also available as deb, so we could drop the rpmextract.sh dependency :) But I think this should be a new bug anyway.
I mean that as soon as the packages are rebuilt at least once, xdelta will work again, as xdelta works for creating patches for, and patching, bsdtar archives.
tar: usr/bin/
bsdtar/find: usr/bin
Some testing needs to be done with pacman to see if it has any negative effects.
.FILELIST should be created with tar; everything else can be done with bsdtar.
1- pacman 3.0.5 comes with libarchive2, so there is nothing to wait for.
2- the .FILELIST in the latest official git is no longer created with tar but with find, so your git doesn't seem to be up to date.
If there is a problem with the creation of the .FILELIST, I think the best option is to write our own tiny C program to create it, which is better than faking compression. I also know that zsh with print -l **/* gives exactly the same output as tar cvf /dev/null *; perhaps there is something in bash that can do it.
I prefer a homemade tiny program like mkfilelist because it is easy to write and to maintain, and we would always be sure of the way .FILELIST is created.
2. The switch to find needs to be reverted back to tar. I don't see the point in creating a program to create .FILELIST when tar does the job perfectly and is already included in the base system.
pacman uses libarchive, which comes with bsdtar, so I think we do not need gnutar in base, as bsdtar can do the job.
Regarding the .FILELIST, I want to test some stuff. Which git version should I consider the official one, in case I propose patches for the FILELIST creation? Currently I use the one in projects.archlinux/git, which uses find to create the package list. Will it be reverted to tar, or are you open to a new proposal for this job? Perhaps there is a better place to discuss this? I'm quite new to archlinux, but I really love it and want to get involved in it.
With regard to which git repo to use: projects.archlinux is the official one. Dan McGee (one of the pacman devs) has a repo at http://code.toofishes.net/gitweb.cgi?p=pacman.git;a=summary and I've got a repo (I'm not a dev, but I do contribute a bit) at http://neptune-one.homeip.net/cgi-bin/gitweb.cgi?p=pacman;a=summary
1. What do we lose by removing any dependency on tar? I would rather go 100% or not at all. My first thought of something missing: the xdelta/rsync tar offset stuff that allows these programs to create smaller binary diffs. We don't use this option at the moment, although I think we should.
2. How can we create the filelist without using GNU tar? It is silly to need such a large program to do a small task, when we could easily write one ourselves to create the filelist in a consistent fashion.
2)
cd $pkgdir
find -type d | sed 's#$#&/#' >.FILELIST
find ! -type d >>.FILELIST
sort .FILELIST > .FILELIST-sorted
mv .FILELIST{-sorted,}
./foo/bar
./foo/bar/test
While tar cvf did this:
foo/bar/
foo/bar/test
Using find -printf '%P\n' fixes this.
Is there really nothing else simpler than the above command?
This works: find * -exec ls -dp {} \; 2>/dev/null
if not, what about this :
find \( -type d -printf '%P/\n' \) , \( ! -type d -printf '%P\n' \) | sort
Anyway looks like the pwd can be removed from find using -mindepth 1 :
find -mindepth 1 \( -type d -printf '%P/\n' \) , \( ! -type d -printf '%P\n' \) | sort
the hidden files in cwd are printed (eg .PKGINFO)
if .FILELIST is always the first hidden file created, it shouldn't matter.
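For illustration, the find one-liner above could be wrapped in a small function like the following. This is only a sketch: the name make_filelist is made up here, GNU find is assumed (for -printf and the comma operator), and LC_ALL=C is added so the sort order does not depend on the locale, which the original command does not do.

```shell
# Hypothetical wrapper around the find one-liner above (GNU find assumed).
make_filelist() {
    (
        cd "$1" || exit 1
        # Directories get a trailing slash, everything else is printed as-is.
        # %P drops the leading "./" and -mindepth 1 skips the top directory.
        find . -mindepth 1 \( -type d -printf '%P/\n' \) , \( ! -type d -printf '%P\n' \) |
            LC_ALL=C sort
    )
}
```

Called as make_filelist pkg, it prints the list on stdout and still includes any hidden files at the top level, such as .PKGINFO, if they already exist.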
I wrote a small C program that can do it like this:
mkfilelist | sort.
I really don't know much of C, so it is just missing the sorting.
Because creating the .FILELIST is a small but important task, having such a program (very easy to maintain, as it is not complicated) would make us sure that the .FILELIST is always well formatted.
This program is just an example, as the code is very trivial. I'm sure it could be done better, but anyway it shows that it is easy to do.
A separate program is better for me because we can keep control of it. If in pacman-15.0.40 there is new information stored in .FILELIST, we would just have to adapt mkfilelist instead of searching for a new tool that can do the job.
Even if I didn't get it right the first time, it was quite close, and it didn't require much time.
Anything wrong with it?
find -mindepth 1 \( -type d -printf '%P/\n' \) , \( ! -type d -printf '%P\n' \) | sort
I personally prefer using existing tools when possible, instead of creating new ones.
If FILELIST needs to change in the future, it shouldn't be hard either to find out how to use
the existing tools the way we want.
Btw, that little C prog doesn't work properly yet. I admit it would probably be easy to fix it,
but I'm really not sure it's a good idea.
Anyway, I'm not the one to decide :)
./mkfilelist <dir to list>
So we would use it like the following:
mkfilelist pkg/
By the way, some timings of these programs were posted on the ML, and this kicks the snot out of them. (Use gcc -O2 -o mkfilelist mkfilelist.c).
Though, I don't think performance is the only thing to take into consideration ;)
Building the filelist is very fast anyway, even with a lot of files,
and there wasn't a big difference anyway.
Eg 100 ms for mkfilelist vs 150 ms for find, on a dir with more than 16000 files :)
But well, if you see only advantages about going this way, then why not.
I just find it a bit overkill to have an external and dedicated binary just for this simple task.
vercmp and testpkg live there, and they are on your system.
Though, I find these more useful than this one. For example, I don't think they can be easily emulated in one line using existing tools. ;)
But I can't see any real downsides either going this way, so I guess it's just a matter of preference, nothing important.
1. I used a few possibly non-portable functions, although this should be easily fixable.
2. find has built-in protection against recursive directory loops, and this simple program does not.
The first one should be easily fixable. The second is not really a problem for us, since the program is specific to package creation, which should not normally contain any directory loops.
I agree with Chantry on this one: why create a new program when there's already one that does the job? vercmp is simply a wrapper to call the vercmp function from libalpm; it's not possible to use existing programs to create a one-line alternative. Also, the creation of .FILELIST is done only once in makepkg, while vercmp is used in several places in makepkg and .INSTALL scripts.
find -mindepth 1 \( -type d -printf '%P/\n' \) , \( ! -type d -printf '%P\n' \) 2>/dev/null | sort >.FILELIST
which leads to inclusion of .PKGINFO .INSTALL .CHANGELOG into .FILELIST
Can we just modify makepkg to create .PKGINFO .INSTALL .CHANGELOG in $startdir instead of $startdir/pkg ?
We need the dot files inside pkg/ so they can be included in the package file, otherwise you've got to do extra work to get them included into the package.
If yes, are there any reasons to do so?
If we always start from a clean pkg/, we won't have all these .* files for creating the filelist.
If that's the case, maybe it could also delete it first.
Sorry if I'm missing something, I don't know makepkg much.
- check deps
- download source
- check checksums
- build pkg and install to pkg/
- tidy up the package
- create the package
-R skips the first 4 steps and uses the existing pkg/ directory to create the package. In this case the pkg/ directory may already contain the dot files from the last time makepkg was run.
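Since pkg/ may already hold dot files from an earlier run, one possible workaround is to filter them out of the generated list. The sketch below is only an illustration, not something makepkg does: the function name is hypothetical, and the set of metadata names is assumed from the files mentioned in this thread.

```shell
# Hypothetical filter: drops top-level package metadata entries from a
# file list read on stdin. The names are assumed from this discussion.
strip_metadata_entries() {
    # -x matches whole lines only, so hidden files deeper in the tree
    # (for example etc/skel/.bashrc) are kept.
    grep -v -x -E '\.(PKGINFO|INSTALL|CHANGELOG|FILELIST)'
}
```

It would sit between the find command and sort in the pipeline, so the metadata files can stay inside pkg/ and still be packed into the archive.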
./autogen.sh
./configure --prefix=/usr --sysconfdir=/etc --localstatedir=/var
make
Then you can use src/pacman/pacman.static, you'll also need the config file from etc/pacman.conf (it contains some new options that won't be in your current pacman.conf).
You can then run `pacman.static --config <config file> ...` to test the new pacman.
So I tested the latest git from http://code.toofishes.net/gitprojects/pacman.git with all my AUR packages (only 5 :)) and it works perfectly. I tested some packages in abs and it works too.
and they looked identical.
Only the order for the second package was a bit different, 3 files in the archive weren't at the same place.
They were at a slightly strange place with bsdtar, but well, I don't see any reason why that would matter, there are many correct orders possible.