FS#7816 - get source from svn/Subversion URLs

Attached to Project: Pacman
Opened by Devin J. Pohly (djpohly) - Tuesday, 14 August 2007, 18:20 GMT
Last edited by Allan McRae (Allan) - Friday, 10 August 2012, 04:55 GMT
Task Type Feature Request
Category makepkg
Status Closed
Assigned To Aaron Griffin (phrakture)
Architecture All
Severity Low
Priority Normal
Reported Version 3.0.5
Due in Version 4.1.0
Due Date Undecided
Percent Complete 100%
Votes 2
Private No

Details

Summary and Info:
Since some projects don't provide tarballs, I'd like to see makepkg have the ability to build a package directly from the latest Subversion repository. (Other VCSs could be added too: Git and CVS come readily to mind.) This might be done just by adding a Subversion URL to the source array, or perhaps by creating another array for source repositories.

Steps to Reproduce:
1. Add a Subversion repository to the source array in PKGBUILD, or perhaps to a new repository array.
2. Run makepkg.

Actual Results:
3. It doesn't know what the heck you're trying to do.

Expected Results:
3. It does know what the heck you're trying to do.
4. It checks out the source or updates an existing repository in the src directory.


I wield some decent bash-fu and would be happy to implement this with a bit of guidance.


Other Thoughts:
* SVN-over-HTTP URLs can't be readily recognized, so we couldn't put them verbatim into the source array. Either we could prefix all Subversion sources with 'svn:', or we could have a separate svnsource array that is handled specially.
* We would need to allow the PKGBUILD to specify the local directory name for the repository, so that if more than one is checked out, we don't have a conflicting "trunk" directory. Again, we could append ':local/path/or/name' to the URL in the source array, or we could allow for this in a separate array as above.
* Repositories would need to be exempt from the MD5 check.
* It would be cool if we could get a revision number (a la svnversion) as part of the pkgver or pkgrel. This could be complicated, though, if there is more than one repository checked out.

Closed by  Allan McRae (Allan)
Friday, 10 August 2012, 04:55 GMT
Reason for closing:  Implemented
Additional comments about closing: https://projects.archlinux.org/pacman.git/commit/?id=024bc44a
Comment by Xavier (shining) - Tuesday, 14 August 2007, 23:08 GMT
Comment by Devin J. Pohly (djpohly) - Wednesday, 15 August 2007, 15:38 GMT
Not until about an hour after I filed this...

I still feel like the current way of doing repo builds is a bit of a hack. Downloading the source doesn't seem like it should be a responsibility of the build() function, and I always found that confusing when trying to use the makepkg -o and -e options to create a custom build. For example, unless you modified the PKGBUILD, you couldn't get makepkg (or, from a brief perusal, versionpkg) to download the sources /and/ the repository without building a package, even though this is what I'd expect makepkg -o to do. It also seems kludgy to put the VCS package in the makedepends array if it's not really required for the build itself.

In addition, if the checkout process is handled by makepkg and not build(), it would be much easier to standardize how it is done, a la the wikipage linked above. Not to mention it would just be a really slick feature. :)
Comment by Dan McGee (toofishes) - Thursday, 23 August 2007, 18:06 GMT
Maybe we could do this by way of a new function called source() or some business. If it existed, makepkg would skip all its usual source downloading techniques and use it instead?

I hesitate to hardcode into makepkg different SCM-fetching schemes. However, this sounds like a great thing to maybe have in /usr/lib/pacman/makepkg/ or something. Source could look something like this:

source () {
/usr/lib/pacman/makepkg/fetchsvn <url> <any other parameters>
}

Thoughts? This would keep complexity out of makepkg while allowing there to be a standard set of scripts, one for each scm, to get the source.
Comment by Devin J. Pohly (djpohly) - Saturday, 25 August 2007, 17:45 GMT
OK, thoughts.

Having a separate script for each SCM makes good sense. I don't see this needing the flexibility of an entire source() function, unless there are some other issues with source-fetching that would require that detailed level of control. But you know better than I would...

What would be great is if each SCM's script was installed with its package, and makepkg then automatically knew it could use that SCM. Sort of a conf.d or profile.d-like directory that is sourced by makepkg, that uses some kind of "hooks" to add functionality.

How about this: an entry in the source array that specifies the SCM to use. Something that is clearly not a URL, e.g. "@svn". Makepkg extracts the name of the SCM into $netfile, uses case to match @*, and calls the corresponding function via fetch_${netfile:1}. Parameters could either be specified with the source (e.g. "@svn,URL,dir-to-checkout-into,user,pass") and split apart, or they could be put in SCM-specific variables (svn_source=(URL1 URL2); svn_user=(user1 user2); svn_pass=(secret1 secret2); ...).

There could be a sourcedepends array for the needed package, or it could just be determined by the success/failure of finding the fetch_xxx function.
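
The "@svn" dispatch idea above can be sketched roughly as follows. This is only an illustration, not code from makepkg: fetch_source and fetch_svn are hypothetical names, the comma-separated entry layout is the one suggested in this comment, and the handler merely echoes the command it would run instead of calling svn.

```bash
#!/bin/bash
# Hypothetical handler; in the proposal each SCM package would ship its own.
fetch_svn() {
    # Only echoes the command it would run; a real handler would call svn.
    echo "svn checkout $1 $2"
}

# Dispatch one source entry. Entries starting with '@' name an SCM handler;
# the remaining comma-separated fields are passed to it as arguments.
fetch_source() {
    local entry=$1 netfile handler
    local -a fields
    case "$entry" in
        @*)
            IFS=, read -r -a fields <<< "$entry"
            netfile=${fields[0]}
            handler="fetch_${netfile:1}"
            # The handler is "installed" if a function of that name exists.
            if declare -F "$handler" > /dev/null; then
                "$handler" "${fields[@]:1}"
            else
                echo "no handler for ${netfile:1}" >&2
                return 1
            fi
            ;;
        *)
            echo "plain download: $entry"
            ;;
    esac
}

fetch_source "@svn,http://example.com/svn/repo/trunk,repo"
fetch_source "http://example.com/foo-1.0.tar.gz"
```

Checking for the function with declare -F is one way to get the "success/failure of finding the fetch_xxx function" behaviour without a separate sourcedepends array.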
Comment by Dan McGee (toofishes) - Tuesday, 20 November 2007, 03:35 GMT
Devin- does the following fix everything you wanted implemented, or would we have to go further (as in not making individual PKGBUILDs fetch their own source)?

http://projects.archlinux.org/git/?p=pacman.git;a=commit;h=8a9c83dd4bffff575a21207248e7acaae5a0d6f9
Comment by Devin J. Pohly (djpohly) - Wednesday, 05 December 2007, 03:10 GMT
My original idea was to keep the fetching code out of the PKGBUILDs altogether, same as the code for fetching regular tarballs--instead, to checkout/update the repository at the same time other sources would be downloaded. This would provide more intuitive behavior when makepkg is run with the -o or -e option, as well as make the code for VCS checkouts easier to standardize and update.

It looks to me like the commit just changes the version number if a VCS source was used... which is a good thing, just not what I had in mind in this RFE.
Comment by Aaron Griffin (phrakture) - Wednesday, 05 December 2007, 17:30 GMT
I'd prefer simply to let versionpkg work as is right now - there's hundreds of scm based PKGBUILDs already, and doing the fetching in the build function is easy and gives us more control.

Think about it: how do we specify the revision in an svn URL? We need custom syntax, and the parsing of it, etc. etc. Then once we set this precedent we'd have to support other SCMs too... it's dirty.
Comment by Devin J. Pohly (djpohly) - Sunday, 09 December 2007, 22:20 GMT
That's fine. If I come up with something really slick at some point, I'll present it for consideration. Doesn't help that this isn't a really pretty problem in the first place. :)

I prefer the idea of more variables to custom URLs, as I was pondering in comment 4. For example:
sourcedepends=('subversion')
_subversion_source=('http://example.com/svn/repo/' 'http://example.net/svn/repo2')
_subversion_revision=('HEAD' '2006-12-31')
_subversion_user=('guest' 'guest')
_subversion_pass=('' 'guest')

Then, as you mentioned above, the scripts /usr/lib/pacman/makepkg/<whatever> (which would be installed with the SCM) could do the actual work. We'd validate sourcedepends to make sure the script would be there, then run it during the source-fetching phase of makepkg.
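
A minimal sketch of how a fetch script might walk these parallel arrays, assuming the variable names from the example above. The helper svn_checkout_cmd is hypothetical, it only prints the command rather than running svn, and the user/password arrays are left out for brevity. The checkout directory is assumed to be the last path component of the URL.

```bash
#!/bin/bash
# Variables as a PKGBUILD might define them under this proposal.
_subversion_source=('http://example.com/svn/repo/' 'http://example.net/svn/repo2')
_subversion_revision=('HEAD' '2006-12-31')

# Build the checkout command for the i-th repository (hypothetical helper).
svn_checkout_cmd() {
    local i=$1
    local url=${_subversion_source[i]}
    local rev=${_subversion_revision[i]:-HEAD}
    # svn expects date revisions wrapped in braces, e.g. -r {2006-12-31}.
    case "$rev" in
        [0-9][0-9][0-9][0-9]-*) rev="{$rev}" ;;
    esac
    local dir
    dir=$(basename "$url")
    echo "svn checkout -r $rev $url $dir"
}

# Print one command per configured repository.
for i in "${!_subversion_source[@]}"; do
    svn_checkout_cmd "$i"
done
```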
Comment by Gavin Bisesi (Daenyth) - Tuesday, 12 August 2008, 14:46 GMT
I think the right way to do this is to have it completely transparent to makepkg, and add it as a DLAGENT in makepkg.conf. At least that's the direction I'm working in. I'm not sure yet whether I will have to modify makepkg. Hopefully I can bang something out by the end of the week.
Comment by Gavin Bisesi (Daenyth) - Tuesday, 12 August 2008, 15:02 GMT
It also seems that if we remove the $_scmrepo and $_scmname variables, then devel_check will break.. I think that perhaps the scripts will also have to have a --check option or so.
Comment by Gavin Bisesi (Daenyth) - Wednesday, 17 September 2008, 16:56 GMT
OK... another issue. How in the world will we handle makepkg -g if we allow git:// and such in the source array?
Comment by Devin J. Pohly (djpohly) - Wednesday, 17 September 2008, 17:21 GMT
As I envision it, SCM-downloaded files wouldn't be subject to integrity checks, because:
a) You can't pin them down. The idea is to have a single PKGBUILD to handle many different revisions of the source code; it defeats the point if we'd have to change the md5sums array every time a source file changes.
b) The SCM client is (or should be) already doing its own integrity check. It's not a straight downloader like Wget or cURL.
c) It would probably be complicated and not fun to program.

How to tell the difference between files that need to be checked and those that don't will depend on how the download/checkout is implemented, I guess.
Comment by Gavin Bisesi (Daenyth) - Wednesday, 17 September 2008, 17:46 GMT
Yes, that's the problem. The checksum function just loops through the source array. We'd have to add something like
if [[ $file =~ ^git ]]; then continue   # no checksum for git sources
elif [[ $file =~ ^svn ]]; then continue # no checksum for svn sources
fi

And so on for every handler... which would be fugly as hell. I suppose we could define an array of scm:// things and loop through that for each $file... which would also suck. I don't see any good way to do it... If bash supported hashes then it wouldn't be so hard, but I really have no clue.
Comment by Devin J. Pohly (djpohly) - Wednesday, 17 September 2008, 23:37 GMT
There are really only two options here, right... check or don't check (for -g, generate or don't generate). Rather than blacklisting all the SCMs, couldn't we just whitelist known plain downloads? For example:

case "$file" in
ftp:* | http:*)
checksum_stuff
;;
*)
no_checksum_stuff
;;
esac
Comment by Roman Kyrylych (Romashka) - Sunday, 28 September 2008, 16:59 GMT
The problem is that, for example, git can be fetched via http:// or ssh://, and IIRC bzr can be used with ftp://.
Comment by Gavin Bisesi (Daenyth) - Sunday, 28 September 2008, 17:43 GMT
I had thought about that and I think the best way would be to add another DLAGENT as git+http://, git+ssh://. I dunno though, this whole thing would be pretty hard to implement as is... Not sure.
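
With a "vcs+transport://" prefix, the checksum question reduces to looking at the part of the protocol before the "+". A rough sketch, assuming that prefix convention; source_type and needs_checksum are hypothetical helper names, not existing makepkg functions, and the list of VCS names is only an example.

```bash
#!/bin/bash
# Print the VCS named by a "vcs+transport://" prefix, or "file" for anything else.
source_type() {
    local proto=${1%%://*}
    case "$proto" in
        git+*|svn+*|bzr+*) echo "${proto%%+*}" ;;
        git|svn)           echo "$proto" ;;
        *)                 echo "file" ;;
    esac
}

# Only plain files get an integrity check; VCS sources are skipped.
needs_checksum() {
    [[ $(source_type "$1") == file ]]
}

for src in 'git+http://example.com/repo.git' 'http://example.com/foo.tar.gz'; do
    if needs_checksum "$src"; then
        echo "checksum: $src"
    else
        echo "skip:     $src"
    fi
done
```

This sidesteps the blacklist/whitelist problem, because a Git repository served over http:// is written as git+http:// and no longer looks like a plain download.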
Comment by Gavin Bisesi (Daenyth) - Sunday, 28 September 2008, 17:45 GMT
Hmm... I just had another idea. We could implement it as a makepkg option: add options=(git), for example, and keep the existing gitroot variable. What do you people think? Worth it?
Comment by Ciprian Dorin Craciun (ciprian.craciun) - Monday, 19 October 2009, 20:22 GMT
I've done some hacking, starting from the pacman Git repository. My work is available at (branch patches/git-dlagent):
http://gitorious.org/~ciprian.craciun/pacman/ciprian-craciun-patches

In summary:
* added a script: /usr/bin/git-dlagent that takes exactly two arguments: URL and output.
* updated /etc/makepkg.conf to include the following URL schemes: git://, git+file://, git+http://, git+https://, git+ssh://, git+rsync://;
* a Git URL looks like git+git://projects.archlinux.org/pacman.git#master##pacman.tar, where:
  * what is before the first `#` is the URL used to (maybe) clone the Git repository;
  * what is after the first `#` and before `##` is the refspec (head / tag / etc.) to export;
  * what is after `##` (this part is optional) can be used to make the file part of the URL look like a tar file;

How git-dlagent works:
* if git+file:// is given, `git --git-dir=... archive` is done;
* otherwise it tries `git archive --remote=...` (usually this doesn't work unless the repository is served via git and the backend is enabled);
* failing that, it falls back to `git clone ...`, followed by a local git archive;
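
For illustration, the URL layout described above can be split with plain bash parameter expansion. This is a sketch under stated assumptions and not the actual git-dlagent code: parse_git_url is a hypothetical name, and stripping the leading "git+" to obtain the clone URL is an assumption about how the prefix is meant to be used.

```bash
#!/bin/bash
# Split "git+URL#refspec##name" into its parts (hypothetical helper).
# Prints three lines: clone URL, refspec, output name (the last two may be empty).
parse_git_url() {
    local url=$1 repo ref="" name="" rest
    repo=${url%%#*}      # everything before the first '#'
    repo=${repo#git+}    # assumption: drop the "git+" marker before cloning
    if [[ $url == *"#"* ]]; then
        rest=${url#*#}                 # everything after the first '#'
        ref=${rest%%'##'*}             # up to the '##' separator, if any
        if [[ $rest == *"##"* ]]; then
            name=${rest#*'##'}         # optional tar-like file name
        fi
    fi
    printf '%s\n' "$repo" "$ref" "$name"
}

parse_git_url 'git+git://projects.archlinux.org/pacman.git#master##pacman.tar'
```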

Any comments?

P.S.: I've tested the script directly with my installed pacman, and only after that I've thrown the script in the repository. But I think it should work.
Comment by Gavin Bisesi (Daenyth) - Monday, 19 October 2009, 21:12 GMT
Never thought of using git archive. Good idea!
Comment by Ciprian Dorin Craciun (ciprian.craciun) - Tuesday, 20 October 2009, 07:52 GMT
So if those present here like my idea, and the direction I'm going in, I could implement similar approaches for SVN, etc.
(My approach just uses the Git / SVN / etc tools to create a .tar or .zip file, and leave the rest up to makepkg. I've also patched makepkg to ignore those hashes that are equal to `--`.)
