FS#34677 - makepkg use --depth 1
Attached to Project:
Pacman
Opened by taylorchu (taylorchu) - Monday, 08 April 2013, 09:02 GMT
Last edited by Allan McRae (Allan) - Tuesday, 09 April 2013, 00:55 GMT
Details
Summary and Info:
Using --depth 1 significantly reduces the amount of data required to clone; it cuts download time by at least 50%:
Twitter Bootstrap saves 88%
VLC saves 87%
Linux mainline saves 85%

The --depth 1 option in git clone:
Create a shallow clone with a history truncated to the specified number of revisions. A shallow repository has a number of limitations (you cannot clone or fetch from it, nor push from nor into it), but is adequate if you are only interested in the recent history of a large project with a long history, and would want to send in fixes as patches.

So:
1. You can still do the "pull once, use forever" workflow.
2. You cannot push to that repo anymore (but who really needs to anyway?).
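To see what a shallow clone actually truncates, here is a minimal sketch that builds a throwaway five-commit repository and clones it both ways. The directory names and the commit count are made up for the illustration; they are not measurements from the projects listed above.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Build a small throwaway repository with five commits.
git init -q "$work/upstream"
cd "$work/upstream"
git config user.email "demo@example.invalid"
git config user.name "Demo"
for i in 1 2 3 4 5; do
    echo "revision $i" > file.txt
    git add file.txt
    git commit -q -m "commit $i"
done

# Full clone vs. shallow clone. The file:// URL matters here:
# git ignores --depth when cloning from a plain local path.
git clone -q "file://$work/upstream" "$work/full"
git clone -q --depth 1 "file://$work/upstream" "$work/shallow"

# Count the commits reachable from HEAD in each clone.
full_count=$(git -C "$work/full" rev-list --count HEAD)
shallow_count=$(git -C "$work/shallow" rev-list --count HEAD)
echo "full clone:    $full_count commits"
echo "shallow clone: $shallow_count commits"
```

The shallow clone carries only the tip commit; how much download that saves on a real project depends on how long and heavy its history is.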
2. Yes, you can clone another repo at a different tag or commit with depth 1, and this is still going to be faster.
3. Yes.
Allan, you might use many git features when you package, but in the common case we just clone and build.
I'm not going to add this, but if someone comes up with a way that this could be selected, then it may be considered.
I do most testing in my local development environment (where you do cool stuff with git), and push when it is somewhat ready.
When the code is tagged or a certain milestone is reached, the build file (like a PKGBUILD) is sent to the build server.
And I love the idea of separating the development environment from the build environment.
So really, we might never face debugging issues at the PKGBUILD stage.
You don't really need ALL of the git history.
And debugging at the PKGBUILD stage happens a lot when you need to test out critical boot issues. I don't want to install a kernel manually, I would much rather have all of the files tracked by pacman so they can be removed or replaced easily.
Yes, certain times I want to run things outside of the makepkg repo, but then I can just locally clone the bare repo that I have, do some tests, then be done. And doing that saves space because of the hardlinking.
The point of a git package is for development. If you're not going to be using the features of git to work with this, then what's the point?
At the very least, I need an option in the 4.1 git source URL to specify what depth I want.
@William
1. True, but the build server is not the same machine as my dev env.
2. Yes, but this is not related.
3. I guess this does not apply. Let me explain why.
* If the repo is on my dev machine, I can already do all kinds of testing I want. I don't really need to clone a bare repo, because I already have the full mirror repo.
* If we need to build on the server, then usually it will be a clean clone, and this is when --depth 1 shines.
4. The point is that the build server waits a long time for the repo to clone, and the only thing the build server does is build.
2) How is that not related? That's the whole reason why we use makepkg, so we're not just throwing things into our systems.
3) This is very related. Let me reason why you can do all of this well and with low bandwidth using the current method:
* This is your dev machine. You don't need to build on it unless you're testing, so there's no reason to even worry about that stuff. If you already have a clone of the repo, just symlink it to your SRCDEST and get on with working.
* Why are you doing a 'clean clone' (which I'm guessing is deleting the repo then cloning it again for no reason)? It's going to be EXACTLY the same, and if you're worried about the non-tracked files, use `git clean -f .` in the git repo (with default settings git refuses to clean without -f). It removes untracked files (add -d for untracked directories and -x for ignored files), if that's what you're worried about, but keeps the repo there so you only have to pull/fetch (you can even use the -n option for git clean to see what would be deleted before doing it).
4) Are you deleting the sources on the build server every time? If so, stop doing that, let it just update the sources. I just pulled the difference between 3.9-rc5 and 3.9-rc6 for the kernel, and it was ~500kb. If that's taking a 'long time' then you have other problems that will only be made worse by shallow clones.
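As an illustration of the cleanup step suggested above, here is a small sketch on a throwaway repository. The file names are invented for the example; the -n dry run shows what would go before -f actually removes it.

```shell
#!/bin/sh
set -eu

repo=$(mktemp -d)
trap 'rm -rf "$repo"' EXIT
cd "$repo"

# A throwaway repository with one tracked file.
git init -q .
git config user.email "demo@example.invalid"
git config user.name "Demo"
echo "tracked" > tracked.txt
git add tracked.txt
git commit -q -m "initial commit"

# Simulate a leftover from a previous build.
echo "junk" > untracked.txt

# Dry run: list what would be removed without touching anything.
dry_run=$(git clean -n .)
echo "$dry_run"

# Real cleanup: -f is required unless clean.requireForce is disabled.
git clean -q -f .
```

The repository and its history stay in place, so the next update is an ordinary fetch or pull rather than a fresh clone.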
You're obviously not seeing a productive work flow here:
$ = dev server, % = build server
$ git clone <some source url>:foo.git
% git clone <dev server address>:foo.git
. Now they both have a full clone.
. I update something on the dev server, and want to build on the build server
% cd foo
% git pull <dev server address>:foo.git master
...
...
...
. Say this is the 10th time we've pulled and built, and we want the dir to be clean
% git clean -f ./
% make
% make check
% makepkg # This is using the <dev server address> in the source array, and the clone you have is symlinked into the SRCDEST. This next part is in makepkg
+% cd $srcdir
+% git clone $SRCDEST/foo
. Hard links save a bunch of space and time as it's local
+% <makepkg stuff...creates a package>
. This spits out 'foo.pkg.tar.xz' and, optionally, cleans up the build dir to save space.
% <moving the package to the dev server if you want to use it there, or building that package there>
$ pacman -U foo.pkg.tar.xz
And bam. Both are using the latest versions of the package, they both have git repositories, and you can do everything you want on them both. (unless one's bare, in which case you can just pull from it)
In that whole thing, your method of a shallow clone would save ~70% on that first clone, which happens ONCE. Why not just leave the clone you have and clean it up with the tools that git gives you?
Your alternative is to avoid the new VCS syntax and simply write the PKGBUILD with whatever download method you want. We will not implement support for this directly via makepkg, but there's nothing that makepkg does that *prevents* you from doing this yourself.
So, users can export the variable with --depth 1 in .bashrc, and
tools like yaourt can set it when calling makepkg (I think this will save a lot of time for all the yaourt users, who really just clone, build, install and remove .git/).
PS: does anybody read comments for closed bugs? :P
Just don't use the source array and manually download the source in prepare(). (i.e. basically the old style of VCS package)
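A minimal sketch of that old-style approach, assuming the example URL used elsewhere in this thread. The `_gitname` and `_giturl` variables are hypothetical names chosen for the illustration, and a real PKGBUILD would still need its usual fields (pkgname, pkgver, build(), package() and so on).

```shell
# PKGBUILD fragment (sketch only, not a complete PKGBUILD).
# _gitname and _giturl are illustrative names, not makepkg variables.
_gitname=mypkg
_giturl=https://example-forge.org/user/mypkg.git

makedepends=('git')
source=()   # left empty on purpose: makepkg downloads nothing itself

prepare() {
    cd "$srcdir"
    # Start from scratch so a stale checkout cannot leak into the build.
    rm -rf "$_gitname"
    # Shallow clone: only the tip commit is downloaded.
    git clone --depth 1 "$_giturl" "$_gitname"
}
```

The trade-off is that makepkg no longer keeps a reusable copy in SRCDEST, so every build downloads the tip again.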
Something like:
source=(mypkg::git_depth_1+https://example-forge.org/user/mypkg.git)
instead of:
source=(mypkg::git+https://example-forge.org/user/mypkg.git)
Maybe with a simpler name but it was just for the example.
That would allow specific packages to save bandwidth, time and pandas.
But there are existing packages that expect the clones to be unshallowed currently, like https://aur.archlinux.org/packages/vim-youcompleteme-git (https://aur.archlinux.org/cgit/aur.git/commit/PKGBUILD?h=vim-youcompleteme-git&id=a5423dd02e56851f04302d9ede980fa4a824400d).
I'm not eager to edit every single PKGBUILD I install on my system just to pass a simple switch to git. That's tedious.
That's why encouraging the maintainers to have a shortcut for --depth=1 would be a good compromise.
A question from a beginner like me who wants to maintain an AUR package: what is the long version of:
source=(mypkg::git+https://example-forge.org/user/mypkg.git)
Is it simply:
prepare() {
git clone https://example-forge.org/user/mypkg.git
}
Or does the source shortcut do more things?
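Going by the workflow described earlier in this thread (a bare clone kept in SRCDEST, then a local clone into $srcdir), the shortcut does more than a single git clone. The following is a rough, simplified sketch of that two-step behaviour, not makepkg's actual code: `fetch_git_source` is a hypothetical helper, and the use of `--mirror` for the cached copy is an assumption.

```shell
# Simplified sketch of a "git+" source entry; not makepkg's real code.
# fetch_git_source is a hypothetical helper name.
# Expects SRCDEST and srcdir to be set, as they are inside makepkg.
fetch_git_source() {
    fgs_name=$1
    fgs_url=$2

    if [ ! -d "$SRCDEST/$fgs_name" ]; then
        # First run: keep a bare mirror of upstream in SRCDEST.
        git clone -q --mirror "$fgs_url" "$SRCDEST/$fgs_name"
    else
        # Later runs: only fetch what changed upstream.
        git -C "$SRCDEST/$fgs_name" fetch -q --all -p
    fi

    # Fresh working copy for this build, cloned from the local mirror.
    rm -rf "$srcdir/$fgs_name"
    git clone -q "$SRCDEST/$fgs_name" "$srcdir/$fgs_name"
}
```

Under those assumptions, the network is only hit for the first mirror clone and for later incremental fetches; the per-build clone into $srcdir is local.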