FS#34677 - makepkg use --depth 1

Attached to Project: Pacman
Opened by taylorchu (taylorchu) - Monday, 08 April 2013, 09:02 GMT
Last edited by Allan McRae (Allan) - Tuesday, 09 April 2013, 00:55 GMT
Task Type Feature Request
Category makepkg
Status Closed
Assigned To No-one
Architecture All
Severity Low
Priority Normal
Reported Version 4.1.0
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 23
Private No

Details

Summary and Info:
--depth 1 significantly reduces the amount of data that has to be cloned. It cuts download time by at least 50%:
Twitter Bootstrap saves 88%
VLC saves 87%
Linux mainline saves 85%

The --depth 1 option in git clone:

Create a shallow clone with a history truncated to the specified
number of revisions. A shallow repository has a number of limitations
(you cannot clone or fetch from it, nor push from nor into it), but is
adequate if you are only interested in the recent history of a large
project with a long history, and would want to send in fixes as
patches.


so,
1. you can also do the "pull once use forever"
2. you cannot push to that repo anymore (but who really needs to anyway?)
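
For illustration, the effect of --depth 1 can be seen on a throwaway repository. This is only a sketch: the repository, file name and commit messages are made up, and a local file:// URL stands in for a real remote.

```shell
#!/bin/sh
# Sketch: compare a full clone with a --depth 1 clone of a throwaway repo.
set -e
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Build a small stand-in "upstream" repository with three commits.
git init -q "$work/upstream"
for i in 1 2 3; do
    echo "revision $i" > "$work/upstream/file.txt"
    git -C "$work/upstream" add file.txt
    git -C "$work/upstream" -c user.name=demo -c user.email=demo@example.invalid \
        commit -q -m "commit $i"
done

# file:// forces the regular transport; git ignores --depth for plain local paths.
git clone -q "file://$work/upstream" "$work/full"
git clone -q --depth 1 "file://$work/upstream" "$work/shallow"

# Count the commits reachable from HEAD in each clone.
full_count=$(git -C "$work/full" rev-list --count HEAD)
shallow_count=$(git -C "$work/shallow" rev-list --count HEAD)
echo "full clone:    $full_count commits"
echo "shallow clone: $shallow_count commit(s)"
```

The shallow clone holds only the tip commit, which is where the download savings quoted above come from.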


Closed by Allan McRae (Allan) - Tuesday, 09 April 2013, 00:55 GMT
Reason for closing: Won't implement
Comment by Allan McRae (Allan) - Monday, 08 April 2013, 09:07 GMT
Can you clone another branch from this checkout? A different tag or commit? Cherry-pick from another branch? All features I actually use in PKGBUILDs, especially when bisecting.
Comment by taylorchu (taylorchu) - Monday, 08 April 2013, 09:22 GMT
1. No, you cannot clone from a depth-1 repo.
2. Yes, you can clone another repo at a different tag or commit with depth 1, and this is still going to be faster.
3. Yes.

Allan, you might use many git features when you package, but in the common case we just clone and build.
Comment by Allan McRae (Allan) - Monday, 08 April 2013, 09:28 GMT
Why use git packages if you don't want to do any debugging with them at any stage?

I'm not going to add this, but if someone comes up with a way that this could be selected, then it may be considered.
Comment by taylorchu (taylorchu) - Monday, 08 April 2013, 09:47 GMT
This might be a personal choice.
I do most testing in a local development environment (where you do cool stuff with git), and push when it is somewhat ready.
When the code is tagged or a certain milestone is reached, the build file (such as the PKGBUILD) is sent to the build server.
And I love the idea of separating the development environment from the build environment.

So really, we might never face a debugging issue at the PKGBUILD stage.
Comment by taylorchu (taylorchu) - Monday, 08 April 2013, 09:51 GMT
If you are really concerned about this, you can use --depth 30 (maybe?).
You don't really need ALL of the git history.
Comment by Dan McGee (toofishes) - Monday, 08 April 2013, 13:11 GMT
No thanks to even adding this. I don't want some other packager deciding I don't want a full clone.
Comment by KaiSforza (KaiSforza) - Monday, 08 April 2013, 15:59 GMT
There is no reason to do this 'separation of development environment and build environment.' makepkg can follow symbolic links, so there's no reason to clone things twice. If you already have the repository, just use it for both.

And debugging at the PKGBUILD stage happens a lot when you need to test out critical boot issues. I don't want to install a kernel manually, I would much rather have all of the files tracked by pacman so they can be removed or replaced easily.

Yes, certain times I want to run things outside of the makepkg repo, but then I can just locally clone the bare repo that I have, do some tests, then be done. And doing that saves space because of the hardlinking.

The point of a git package is for development. If you're not going to be using the features of git to work with this, then what's the point?
Comment by taylorchu (taylorchu) - Monday, 08 April 2013, 17:34 GMT
@Dan
At least I need an option in the 4.1 git source url to specify what depth level I want.

@William
1. True, but the build server is not the same machine as my dev environment.
2. Yes, but this is not related.
3. I guess this does not apply. Let me explain why.
* If the repo is on my dev machine, I can already do all the testing I want. I don't really need to clone the bare repo, because I already have the full mirror repo.
* If we need to build on the server, then it will usually be a clean clone, and this is when --depth 1 shines.
4. The point is that the build server waits a long time for the repo to clone, and the only thing the build server does is build.
Comment by KaiSforza (KaiSforza) - Monday, 08 April 2013, 19:40 GMT
1) Okay, if you've got it on your 'dev machine' then just create the package there, and it seems like you're building a lot, so it shouldn't take that long at all to just fetch a simple update to the source.
2) How is that not related? That's the whole reason why we use makepkg, so we're not just throwing things into our systems.
3) This is very related. Let me reason why you can do all of this well and with low bandwidth using the current method:
* This is your dev machine. You don't need to build on it unless you're testing, so there's no reason to even worry about that stuff. If you already have a clone of the repo, just symlink it to your SRCDEST and get on with working.
* Why are you doing a 'clean clone' (which I'm guessing is deleting the repo then cloning it again for no reason)? It's going to be EXACTLY the same, and if you're worried about the non-tracked files, use `git clean .` in the git repo. It removes every file not tracked by git, if that's what you're worried about, but keeps the repo there so you only have to pull/fetch (you can even use the -n option for git clean to see what would be deleted before doing it).
4) Are you deleting the sources on the build server every time? If so, stop doing that, let it just update the sources. I just pulled the difference between 3.9-rc5 and 3.9-rc6 for the kernel, and it was ~500kb. If that's taking a 'long time' then you have other problems that will only be made worse by shallow clones.
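
The `git clean` suggestion in point 3 can be tried on a scratch repository. This is a sketch under assumptions: the file names are invented, and `-f` is passed explicitly because a default git configuration (clean.requireForce) refuses to delete anything without it.

```shell
#!/bin/sh
# Sketch: preview and remove untracked files while keeping the repository.
set -e
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

git init -q "$work/repo"
echo "tracked" > "$work/repo/tracked.txt"
git -C "$work/repo" add tracked.txt
git -C "$work/repo" -c user.name=demo -c user.email=demo@example.invalid \
    commit -q -m "initial"

# Simulate a leftover build product that git does not track.
echo "junk" > "$work/repo/build-artifact.o"

# -n is the dry run: it lists what would be deleted and touches nothing.
preview=$(git -C "$work/repo" clean -n .)
echo "$preview"

# -f performs the deletion. Add -d to include untracked directories
# and -x to include files matched by .gitignore.
git -C "$work/repo" clean -q -f .
ls "$work/repo"
```

The history and tracked files stay in place, so the next build only needs a fetch rather than a fresh clone.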

You're obviously not seeing a productive work flow here:
$ = dev server, % = build server
$ git clone <some source url>:foo.git
% git clone <dev server address>:foo.git
. Now they both have a full clone.
. I update something on the dev server, and want to build on the build server
% cd foo
% git pull <dev server address>:foo.git master
...
...
...
. Say this is the 10th time we've pulled and built, and we want the dir to be clean
% git clean ./
% make
% make check
% makepkg # This is using the <dev server address> in the source array, and the clone you have is symlinked into the SRCDEST. This next part is in makepkg
+% cd $srcdir
+% git clone $SRCDEST/foo
. Hard links save a bunch of space and time as it's local
+% <makepkg stuff...creates a package>
. This spits out 'foo.pkg.tar.xz' and, optionally, cleans up the build dir to save space.
% <moving the package to the dev server if you want to use it there, or building that package there>
$ pacman -U foo.pkg.tar.xz

And bam. Both are using the latest versions of the package, they both have git repositories, and you can do everything you want on them both. (unless one's bare, in which case you can just pull from it)

In that whole thing, your method of a shallow clone would save ~70%, on that first clone, which happens ONCE. Why not just leave the clone you have and clean it up with the tools that git gives you?
Comment by Radka Kopackova (hsn10) - Saturday, 03 August 2013, 19:29 GMT
I agree with taylorchu. It's very useful for build servers. What about implementing it as an option? It would keep both parties happy.
Comment by Dave Reisner (falconindy) - Saturday, 03 August 2013, 19:37 GMT
> What about implementing it as an option?
Your alternative is to avoid the new VCS syntax and simply write the PKGBUILD with whatever download method you want. We will not implement support for this directly via makepkg, but there's nothing that makepkg does that *prevents* you from doing this yourself.
Comment by mark (mmm) - Wednesday, 21 August 2013, 23:20 GMT
I agree with passing a flag/param. How about making it look like: git clone $_MAKEPKG_VCS_PARAMS ...
That way, users can export the variable with --depth 1 in .bashrc,
and tools like yaourt can set it when calling makepkg (I think this will save a lot of time for all the yaourt users, where you really just clone, build, install and remove .git/).

PS: does anybody read comments for closed bugs? :P
Comment by Allan McRae (Allan) - Wednesday, 21 August 2013, 23:49 GMT
That breaks the second checkout that is done by makepkg when "extracting sources".

Just don't use the source array and manually download the source in prepare(). (i.e. basically the old style of VCS package)
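
A rough sketch of that approach, as a PKGBUILD fragment. The package name, the `_giturl` variable and the URL are hypothetical placeholders, and the fragment leaves out build() and package(); it only shows the manual shallow download.

```shell
# PKGBUILD fragment (sketch only). pkgname, _giturl and the URL are
# hypothetical; nothing here is taken from a real package.
pkgname=mypkg-git
pkgver=r0
pkgrel=1
arch=('any')
makedepends=('git')
source=()   # deliberately empty, so makepkg's own VCS handling is not used
_giturl='https://example.org/user/mypkg.git'

prepare() {
    cd "$srcdir"
    if [ -d mypkg/.git ]; then
        # Reuse an existing shallow checkout: fetch only the new tip
        # and move the working tree to it.
        git -C mypkg fetch --depth 1 origin
        git -C mypkg reset --hard FETCH_HEAD
    else
        # First run: shallow clone with a single commit of history.
        git clone --depth 1 "$_giturl" mypkg
    fi
}
```

Because the source array is empty, makepkg does no caching in SRCDEST and applies none of its VCS conveniences; the PKGBUILD has to handle all of that itself.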
Comment by tuxayo (tuxayo) - Friday, 07 November 2014, 18:39 GMT
Would providing a method for the packager to explicitly use "--depth 1" help?

Something like:
source=(mypkg::git_depth_1+https://example-forge.org/user/mypkg.git)
instead of:
source=(mypkg::git+https://example-forge.org/user/mypkg.git)

Maybe with a simpler name, but it was just for the example.

That would allow specific package to save bandwidth, time and pandas.
Comment by Daniel Hahler (blueyed) - Sunday, 01 May 2016, 13:28 GMT
It is easy to unshallow a shallow clone, if you need Git's features for developing on a package: `git fetch --unshallow`.

But there are existing packages that expect the clones to be unshallowed currently, like https://aur.archlinux.org/packages/vim-youcompleteme-git (https://aur.archlinux.org/cgit/aur.git/commit/PKGBUILD?h=vim-youcompleteme-git&id=a5423dd02e56851f04302d9ede980fa4a824400d).
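
As a small illustration of `git fetch --unshallow`, the sketch below converts a depth-1 clone of a throwaway repository back into a full one. The repository contents are invented and a file:// URL stands in for the real remote.

```shell
#!/bin/sh
# Sketch: turn a shallow clone into a full clone with --unshallow.
set -e
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Stand-in upstream repository with three commits.
git init -q "$work/upstream"
for i in 1 2 3; do
    echo "revision $i" > "$work/upstream/file.txt"
    git -C "$work/upstream" add file.txt
    git -C "$work/upstream" -c user.name=demo -c user.email=demo@example.invalid \
        commit -q -m "commit $i"
done

# Start from a depth-1 clone.
git clone -q --depth 1 "file://$work/upstream" "$work/clone"
before=$(git -C "$work/clone" rev-list --count HEAD)

# Fetch the history that the shallow clone left out.
git -C "$work/clone" fetch -q --unshallow
after=$(git -C "$work/clone" rev-list --count HEAD)

echo "commits before: $before, after: $after"
```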
Comment by Marcin Mielniczuk (marmistrz) - Thursday, 01 September 2016, 08:20 GMT
Please add a supported option to makepkg.conf that allows the depth to be specified manually. Not everyone has fast and unlimited Internet.
Comment by Allan McRae (Allan) - Thursday, 01 September 2016, 10:34 GMT
We will NEVER add support for that. You can download the source using --depth=1 in the prepare() function.
Comment by Marcin Mielniczuk (marmistrz) - Thursday, 01 September 2016, 11:32 GMT
Please tell me what is so wrong about giving the end-user the RIGHT TO CHOOSE?
Comment by Allan McRae (Allan) - Thursday, 01 September 2016, 11:47 GMT
It seems even when they have a choice pointed out to them, they still can't choose. So I treat all users like idiots.
Comment by Marcin Mielniczuk (marmistrz) - Thursday, 01 September 2016, 15:08 GMT
But why punish those who can choose? If they don't know what to do, let them simply leave the default options.

I'm not eager to edit every single PKGBUILD I install on my system just to pass a simple switch to git. That's tedious.
Comment by tuxayo (tuxayo) - Sunday, 11 September 2016, 19:44 GMT
I agree that giving the choice to the end user wouldn't work because many PKGBUILDs would break.

That's why encouraging the maintainers to have a shortcut for --depth=1 would be a good compromise.

A question from a beginner like me who wants to maintain an AUR package: what is the long version of

source=(mypkg::git+https://example-forge.org/user/mypkg.git)

Is it simply:

prepare() {
    git clone https://example-forge.org/user/mypkg.git
}

Or does the source shortcut do more things?
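
For context, earlier comments in this task describe makepkg as working in two steps: it keeps a bare repository under SRCDEST and then makes a second, local clone into the build directory when "extracting sources". The sketch below imitates that flow against a throwaway upstream. The exact git flags (`--mirror`, `fetch --all -p`) are an assumption about what makepkg runs, not something stated in this thread, and the real tool may do more, for example checking out a requested branch, tag or commit.

```shell
#!/bin/sh
# Sketch of a two-step VCS source flow: cached mirror, then local clone.
set -e
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Stand-in upstream repository (hypothetical content).
git init -q "$work/upstream"
echo "hello" > "$work/upstream/README"
git -C "$work/upstream" add README
git -C "$work/upstream" -c user.name=demo -c user.email=demo@example.invalid \
    commit -q -m "initial"

SRCDEST="$work/srcdest"   # cache that survives between builds
srcdir="$work/src"        # per-build working area
mkdir -p "$SRCDEST" "$srcdir"

# Step 1 (download): create the bare mirror once, update it on later runs.
if [ -d "$SRCDEST/mypkg" ]; then
    git -C "$SRCDEST/mypkg" fetch -q --all -p
else
    git clone -q --mirror "file://$work/upstream" "$SRCDEST/mypkg"
fi

# Step 2 (extract): a local clone of the mirror gives a working tree.
git clone -q "$SRCDEST/mypkg" "$srcdir/mypkg"
ls "$srcdir/mypkg"
```

A bare `git clone` inside prepare() skips the cached mirror, so every clean build downloads the repository again.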
