FS#23010 - Integrate git with the aur.

Attached to Project: AUR web interface
Opened by Thomas Dziedzic (tomd123) - Tuesday, 22 February 2011, 18:53 GMT
Last edited by Lukas Fleischer (lfleischer) - Tuesday, 09 June 2015, 07:20 GMT
Task Type Feature Request
Category Backend
Status Closed
Assigned To Lukas Fleischer (lfleischer)
Architecture All
Severity Medium
Priority Normal
Reported Version 1.8.0
Due in Version 4.0.0
Due Date Undecided
Percent Complete 100%
Votes 29
Private No

Details

It would be great if the aur committed package updates to a git repo, or better yet, if the aur was a collection of git repos (one for each package).
One example of this is: http://pkgs.fedoraproject.org/gitweb/
This would allow secure uploading, no worries about tarball bombs, multiple maintainers, history, etc.

I currently manage an aur git mirror http://pkgbuild.com/git/aur.git/ but I would love it if the aur itself had this feature so that it would make my mirror unnecessary.

Closed by  Lukas Fleischer (lfleischer)
Tuesday, 09 June 2015, 07:20 GMT
Reason for closing:  Implemented
Additional comments about closing:  Implemented in 4.0.0-rc1.
Comment by Loui Chang (louipc) - Wednesday, 23 February 2011, 05:45 GMT
I don't quite understand how git stops zipbombs, or how the AUR is insecure
if you use ssl. If there is a problem with the security please file a separate
bug report.

I'm not too thrilled with the idea of having to clone the entire AUR, so it
would have to be smaller separate repos.

What really needs to be explored are the needs that we are trying to fulfill.
Don't get me wrong - I absolutely love git - but it might not make sense to
use it to hammer in this nail.

Setting up git push access for everyone and with granular permissions
per package might be troublesome, even with gitosis. But yeah, I haven't
tried it, so I'm prepared to eat my words. I do know that if a multiple
maintainer feature is implemented in the AUR, it would just come stock
with the AUR and the DB schema, and would come free with the rest
of the AUR setup if anyone did want to maintain a mirrored repo.

I would really encourage you or other community members to try to come up
with these solutions.
Comment by Lukas Fleischer (lfleischer) - Wednesday, 23 February 2011, 10:47 GMT
Well, as far as I understand, ZIP bombs will no longer be possible as users will no longer submit tarballs but just `git push` their source files instead. We'd still need something like quotas to ensure no one's trashing sigurd and some pre-/post-receive hooks to validate the files pushed via Git tho... I'm just guessing...

The authentication problem is feasible imho, although we need to keep an eye on that...

td123: Can you be a bit more detailed about how the repos should integrate with the AUR? How do users submit packages? How does a user disown/adopt a package? How are comments and notifications managed? Also, how will package validation be done?
Comment by Thomas Dziedzic (tomd123) - Wednesday, 23 February 2011, 16:46 GMT
RE: louipc
"I don't quite understand how git stops zipbombs, or how the AUR is insecure
if you use ssl. If there is a problem with the security please file a separate
bug report."
I said secure uploading because as I currently see it, we upload packages using http.
Using git over ssh would make this a lot more secure.
Also, it would prevent zip bombs because there wouldn't be any files you have to unzip; you would just commit the files you want to git.

"I'm not too thrilled with the idea of having to clone the entire AUR, so it
would have to be smaller separate repos."
I also had separate package repos in mind. For one, it allows multiple maintainers. Take fedora's example.

"What really needs to be explored are the needs that we are trying to fulfill.
Don't get me wrong - I absolutely love git - but it might not make sense to
use it to hammer in this nail."
It really would fulfill the features I listed earlier. I especially would love the history of a package. For example, right now, I only use my pkgbuild mirror because I can look at realtime changes taking place and also have access to older versions of packages.

"Setting up git push access for everyone and with granular permissions
per package might be troublesome, even with gitosis. But yeah, I haven't
tried it, so I'm prepared to eat my words. I do know that if a multiple
maintainer feature is implemented in the AUR, it would just come stock
with the AUR and the DB schema, and would come free with the rest
of the AUR setup if anyone did want to maintain a mirrored repo."
I think initially it would be a pain, but in the long run it would be a *very* nice feature.

"I would really encourage you or other community members to try to come up
with these solutions."
Normally I wouldn't say this, but I think this is one dilemma that has been solved by fedora (see my link in op).

Again, another solution to this could be that we just don't change anything and do what my aur git mirror does: on every upload, it extracts the AUR package and commits the files to a git repo.
Comment by Thomas Dziedzic (tomd123) - Wednesday, 23 February 2011, 17:00 GMT
RE: cryptocrack
"Can you be a bit more detailed about how the repos should integrate with the AUR?"
Well the only thing I would see that would be changed in the frontend would be the way users upload packages.
Instead of using the submit page, they would associate an ssh key with their account (kind of like github), and the server would set up the appropriate git permissions.
This would keep the original ease of use, except it would require one more step: associating an ssh key with your account.
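A minimal sketch of what associating an ssh key with an account could look like on the server, assuming OpenSSH's `authorized_keys` format with a forced command. The wrapper path `/usr/local/bin/aur-git-serve` and the function name are hypothetical and do not come from this thread; only the `command=` and `no-*` options are standard OpenSSH options.

```python
def authorized_keys_line(username: str, public_key: str,
                         forced_command: str = "/usr/local/bin/aur-git-serve") -> str:
    """Build one authorized_keys entry that ties a public key to an AUR account.

    The forced command (a hypothetical wrapper) receives the account name, so
    the server can decide which package repos this key may push to.
    """
    parts = public_key.split()
    if len(parts) < 2 or not parts[0].startswith(("ssh-", "ecdsa-")):
        raise ValueError("not an OpenSSH public key")
    if not username.isalnum():
        # Keep the account name from injecting extra options or shell syntax.
        raise ValueError("unexpected characters in username")
    key_type, key_data = parts[0], parts[1]
    options = [
        f'command="{forced_command} {username}"',
        "no-port-forwarding",
        "no-X11-forwarding",
        "no-agent-forwarding",
        "no-pty",
    ]
    # The key comment (e.g. user@host) is dropped on purpose.
    return f"{','.join(options)} {key_type} {key_data}"
```

With an entry like this, every account could share one system user, and the wrapper would decide per package whether a push is allowed.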

"How do users submit packages? How does a user disown/adopt a package?"
I think disown could be automated by removing the git permissions for that user for a particular repo.
Adopt would work as follows: adopting a package without maintainers would be allowed automatically, while adopting a package with multiple maintainers would require the maintainers' or TUs' approval.
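The disown and adopt rules above could be modelled roughly like this. It is only a sketch: the class and parameter names are made up for illustration, and it assumes that any package which already has at least one maintainer needs approval from a current maintainer or a TU.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, Optional, Set


@dataclass
class PackageRepo:
    """Push permissions for one per-package repo (illustrative model only)."""
    name: str
    maintainers: Set[str] = field(default_factory=set)

    def disown(self, user: str) -> None:
        # Disowning simply drops the user's push permission for this repo.
        self.maintainers.discard(user)

    def adopt(self, user: str, approved_by: Optional[str] = None,
              trusted_users: FrozenSet[str] = frozenset()) -> bool:
        # An orphaned package can be adopted automatically.
        if not self.maintainers:
            self.maintainers.add(user)
            return True
        # Otherwise a current maintainer or a TU has to approve.
        if approved_by in self.maintainers or approved_by in trusted_users:
            self.maintainers.add(user)
            return True
        return False
```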

"How are comments and notifications managed?"
I would imagine that we could keep our current system.

"Also, how will package validation be done?"
This would probably be the most tricky question to answer at this time.
Maybe we can implement something with git hooks, though I don't know if there are any that are before the commit is made.


Just to reiterate another idea I mentioned to louipc, "another solution to this could be that we just don't change anything and do what my aur git mirror does: on every upload, it extracts the AUR package and commits the files to a git repo."
Comment by Lukas Fleischer (lfleischer) - Wednesday, 23 February 2011, 17:54 GMT
td123: How's Git over SSH more secure than HTTPS and web-based authentication? Except that you can use public key authentication instead of plain passwords?

ZIP bombs will still be an issue with Git. Git compresses data as well and needs to uncompress it when checking stuff out. The vulnerability is just not that obvious, plus it's another format (zlib compression and deltas instead of gzip'ed tarballs), which doesn't make any difference tho. As I said before, we'd definitely need some quotas *and* some heuristic Git hooks to detect "ZIP bombs" (in a metaphorical sense).
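A rough sketch of the kind of quota check a receive hook could run, assuming the hook feeds it the output of `git ls-tree -r -l <commit>` (which lists the uncompressed size of each blob). The function names and both limits are assumptions for illustration, not values agreed on in this thread.

```python
from typing import List, Tuple


def parse_ls_tree(output: str) -> List[Tuple[str, int]]:
    """Turn `git ls-tree -r -l` output into (path, uncompressed size) pairs."""
    entries = []
    for line in output.splitlines():
        if not line:
            continue
        # Each line looks like: "<mode> <type> <object> <size>\t<path>"
        meta, path = line.split("\t", 1)
        _mode, obj_type, _object_id, size = meta.split()
        if obj_type == "blob":  # submodule entries report "-" as size
            entries.append((path, int(size)))
    return entries


def quota_violations(entries: List[Tuple[str, int]],
                     max_file_bytes: int = 256 * 1024,
                     max_total_bytes: int = 1024 * 1024) -> List[str]:
    """Return human-readable reasons to reject the push (empty if it is fine)."""
    problems = [
        f"{path}: {size} bytes exceeds the per-file limit"
        for path, size in entries
        if size > max_file_bytes
    ]
    total = sum(size for _, size in entries)
    if total > max_total_bytes:
        problems.append(f"total of {total} bytes exceeds the repository limit")
    return problems
```

Because the sizes are the uncompressed ones, a tiny but highly compressible blob would still be caught. Paths with unusual characters are quoted by git unless `-z` is used, which this sketch does not handle.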

Another question that came into my mind... How do we parse packages? Like extract all the data from the PKGBUILD and put it in the MySQL database? We'll need to do this in some Git hook, too.

I kinda like the idea but it's not that easy to implement. Especially from a security point of view.
Comment by Lukas Fleischer (lfleischer) - Thursday, 24 February 2011, 08:50 GMT
Ok, I pondered this idea some more. How about the following approach:

Submitting packages is done the same way as it's done now, using tarballs. Additionally, there's a text field where users can put submission reasons like "Initial upload.", "Updated $foo package to 1.4.2." or "Added $foobar patch.". After doing the usual tarball validation and PKGBUILD parsing, the tarball is extracted but *not* copied to a directory that is publicly accessible via HTTP. Instead, the incoming packages directory is a tree of Git repositories. After having extracted everything, there are two possibilities: 1. The submission is an initial upload. In this case, a new Git repo is first initialized inside the directory the tarball has been extracted to. 2. It's a package update. In this case, nothing special is done. In both cases, `git commit -am "$foo"` is run, "$foo" being the submission message the user typed earlier.

The advantage of this is that a whole bunch of vulnerabilities (XSS, symlink attacks, malicious CGIs) won't affect us. We could just set up a cgit interface to give HTTP access to everything and link to the package repo's cgit URLs in the package details view. The PKGBUILD link will be replaced by a link to the plain cgit preview of the current HEAD's PKGBUILD file. The tarball download link will be replaced by a link to a snapshot tarball of current HEAD's tree (cgit creates these automatically). We could even implement file preview for all source files again, linking to the cgit plain tree views of the files (using current HEAD as commit, of course). The only issue that would remain is the ZIP bomb vulnerability, which we already fixed [1] tho.

If everything works fine, we can still think about giving direct push access to the Git repos later. Does that sound nice? :)

[1] http://projects.archlinux.org/aur.git/commit/?id=09d8128f99c2edc27dd81efc63e9b3c797603ca1
Comment by Lukas Fleischer (lfleischer) - Thursday, 24 February 2011, 10:05 GMT
Ah, `git commit -a` is not enough, of course. We'll rather need to:

* `git rm` all files from the directory (except dotfiles, of course)
* extract the tarball (we'll probably need to reject source tarballs that contain dotfiles to ensure no one overwrites the ".git" directory)
* add all extracted files to the index
* `git commit`
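The steps above might be sketched as follows, using Python's `tarfile` and plain `git` commands. This is an illustration under assumptions, not how the AUR does it: the function names are invented, the archive is assumed to hold the files at its root, and links and device nodes are rejected as an extra precaution that goes beyond the dotfile rule.

```python
import io
import subprocess
import tarfile
from pathlib import PurePosixPath
from typing import List


def check_members(tar: tarfile.TarFile) -> List[str]:
    """Reject archives that could touch ".git" or escape the repo directory."""
    names = []
    for member in tar.getmembers():
        path = PurePosixPath(member.name)
        if path.is_absolute() or ".." in path.parts:
            raise ValueError(f"unsafe path: {member.name}")
        if any(part.startswith(".") for part in path.parts):
            raise ValueError(f"dotfile rejected: {member.name}")
        if not (member.isfile() or member.isdir()):
            raise ValueError(f"unsupported entry type: {member.name}")
        names.append(member.name)
    return names


def commit_update(repo_dir: str, tarball: bytes, message: str) -> None:
    """Replace the tracked files with the tarball contents and commit."""
    with tarfile.open(fileobj=io.BytesIO(tarball), mode="r:*") as tar:
        check_members(tar)
        # Drop every tracked file first so that deleted files disappear too.
        subprocess.run(["git", "rm", "-r", "-q", "--ignore-unmatch", "."],
                       cwd=repo_dir, check=True)
        tar.extractall(repo_dir)
    subprocess.run(["git", "add", "--all"], cwd=repo_dir, check=True)
    # Note: `git commit` exits non-zero when nothing changed.
    subprocess.run(["git", "commit", "-q", "-m", message],
                   cwd=repo_dir, check=True)
```

Validating before touching the working tree means a rejected upload leaves the repository exactly as it was.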
Comment by Thomas Dziedzic (tomd123) - Saturday, 26 February 2011, 18:38 GMT
This is currently the way I update my git mirror: http://pastie.org/private/jfgeofla9z9jwagmh3uw
Comment by Thomas Dziedzic (tomd123) - Sunday, 27 February 2011, 19:53 GMT
"If everything works fine, we can still think about giving direct push access to the Git repos later. Does that sound nice? :)"

That's a good trade off.
Still has the ease of uploading .tar.gzs from a web interface with history.
Sounds nice :)
Comment by Markus Unterwaditzer (untitaker) - Wednesday, 11 July 2012, 12:44 GMT
How about a simpler feature which allows us to let the AUR backend pull from external Git repositories?
Comment by KaiSforza (KaiSforza) - Tuesday, 15 January 2013, 05:29 GMT
I think that combining this with falconindy's solution for  FS#15043  would make this work really well. The main blocking part of this seems to be the parsing of the PKGBUILD after it's uploaded. With a dedicated .SRCINFO file there won't be any issues with that. Also, the only thing that would have to happen when adding tar files is to extract all non-dotfiles to the right git repository, put the .SRCINFO file there, and run `git commit -a`.

* the parsing of PKGBUILDs takes place on the client end, so no issues with bash breaking things.
* put an extremely tight quota on packages that can be overridden by special permission (1MB would probably be fine).
* just a post-receive hook to ensure that there actually is a PKGBUILD and .SRCINFO at least.
* it could use the way mentioned above to add files the normal way (as .src.tar.gz files that also have a tiny upload limit)
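A minimal sketch of the hook-side checks from the list above, assuming .SRCINFO is a flat list of `key = value` lines in which a key may repeat. That layout, along with the function names, is an assumption made for illustration; this thread does not define the file format.

```python
from typing import Dict, Iterable, List

REQUIRED_FILES = ("PKGBUILD", ".SRCINFO")


def missing_required(paths: Iterable[str]) -> List[str]:
    """Return the required files that are absent from the pushed tree."""
    present = set(paths)
    return [name for name in REQUIRED_FILES if name not in present]


def parse_srcinfo(text: str) -> Dict[str, List[str]]:
    """Collect `key = value` lines; repeated keys accumulate in order."""
    fields: Dict[str, List[str]] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first "=" only, so values like "bar>=1.0" stay intact.
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"malformed line: {raw!r}")
        fields.setdefault(key.strip(), []).append(value.strip())
    return fields
```

Since this only reads plain text, the server never has to execute or source the PKGBUILD to fill the database.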
Comment by Marcy Latham (sexyshortages) - Tuesday, 28 January 2014, 16:19 GMT
Hey guys, it's nearly three years since the first request :-(

Has there been any progress on this issue? What is the blocking part?
Comment by Dave Reisner (falconindy) - Tuesday, 28 January 2014, 16:27 GMT
> What is the blocking part?
A design doc? Code? An interested party to take ownership?

This is volunteer open source development... no one is going to work on this unless they have the time and interest.
Comment by GI Jack (GI_Jack) - Tuesday, 12 August 2014, 21:35 GMT
I like the idea, vote added.
