FS#23010 - Integrate git with the aur.
Attached to Project:
AUR web interface
Opened by Thomas Dziedzic (tomd123) - Tuesday, 22 February 2011, 18:53 GMT
Last edited by Lukas Fleischer (lfleischer) - Tuesday, 09 June 2015, 07:20 GMT
Details
It would be great if the aur committed package updates to a
git repository, or better yet, if the aur was a collection of git repos
(one for each package).
One example of this is: http://pkgs.fedoraproject.org/gitweb/
This would allow secure uploading, no worries about tarball bombs, multiple maintainers, history, etc. I currently manage an aur git mirror at http://pkgbuild.com/git/aur.git/ but I would love it if the aur itself had this feature, since that would make my mirror unnecessary.
Closed by Lukas Fleischer (lfleischer)
Tuesday, 09 June 2015, 07:20 GMT
Reason for closing: Implemented
Additional comments about closing: Implemented in 4.0.0-rc1.
I don't quite understand how git stops zipbombs, or how the AUR is unsecure
if you use ssl. If there is a problem with the security please file a separate
bug report.
I'm not too thrilled with the idea of having to clone the entire AUR, so it
would have to be smaller separate repos.
What really needs to be explored are the needs that we are trying to fulfill.
Don't get me wrong - I absolutely love git - but it might not make sense to
use it to hammer in this nail.
Setting up git push access for everyone and with granular permissions
per package might be troublesome, even with gitosis. But yeah, I haven't
tried it, so I'm prepared to eat my words. I do know that if a multiple
maintainer feature is implemented in the AUR, it would just come stock
with the AUR and the DB schema, and would come free with the rest
of the AUR setup if anyone did want to maintain a mirrored repo.
I would really encourage you or other community members to try to come up
with these solutions.
The authentication problem is solvable imho, although we need to keep an eye on it...
td123: Can you be a bit more detailed about how the repos should integrate with the AUR? How do users submit packages? How does a user disown/adopt a package? How are comments and notifications managed? Also, how will package validation be done?
"I don't quite understand how git stops zipbombs, or how the AUR is unsecure
if you use ssl. If there is a problem with the security please file a separate
bug report."
I said secure uploading because as I currently see it, we upload packages using http.
Using git over ssh would make this a lot more secure.
Also it would prevent zip bombs because there wouldn't be any files you have to unzip, you would just commit the files you want to git.
"I'm not too thrilled with the idea of having to clone the entire AUR, so it
would have to be smaller separate repos."
I also had separate package repos in mind. For one, it allows multiple maintainers. Take fedora's example.
"What really needs to be explored are the needs that we are trying to fulfill.
Don't get me wrong - I absolutely love git - but it might not make sense to
use it to hammer in this nail."
It really would fulfill the features I listed earlier. I especially would love the history of a package. For example, right now, I only use my pkgbuild mirror because I can look at realtime changes taking place and also have access to older versions of packages.
"Setting up git push access for everyone and with granular permissions
per package might be troublesome, even with gitosis. But yeah, I haven't
tried it, so I'm prepared to eat my words. I do know that if a multiple
maintainer feature is implemented in the AUR, it would just come stock
with the AUR and the DB schema, and would come free with the rest
of the AUR setup if anyone did want to maintain a mirrored repo."
I think initially it would be a pain, but in the long run it would be a *very* nice feature.
"I would really encourage you or other community members to try to come up
with these solutions."
Normally I wouldn't say this, but I think this is one dilemma that has been solved by fedora (see my link in op).
Again, another solution to this could be that we just don't change anything and do what my aur git mirror does: on every upload, it extracts the AUR package and commits the files to a git repo.
"Can you be a bit more detailed about how the repos should integrate with the AUR?"
Well the only thing I would see that would be changed in the frontend would be the way users upload packages.
Instead of using the submit page, they would associate an ssh key with their account (kind of like github), and the server would set up the appropriate git permissions.
This would keep the original ease of use, except it would require one more step: associating an ssh key with your account.
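One way the key association could be wired up on the server is an `authorized_keys` entry with a forced command, similar in spirit to what tools such as gitosis do. The following is only a sketch under assumptions: the wrapper path `/usr/local/bin/aur-git-serve` and the function name are hypothetical, and the wrapper itself (which would check per-package permissions) is not shown.

```python
import re

# Hypothetical wrapper script that would check per-package permissions
# before handing over to git; it is an assumption, not an existing tool.
GIT_WRAPPER = "/usr/local/bin/aur-git-serve"

# Standard OpenSSH authorized_keys options that disable everything
# except running the forced command.
RESTRICTIONS = "no-port-forwarding,no-X11-forwarding,no-agent-forwarding,no-pty"

USERNAME_RE = re.compile(r"[A-Za-z0-9_.-]+")


def authorized_keys_line(username, public_key):
    """Render one authorized_keys entry for an account.

    Every login with this key is forced through the wrapper, which
    receives the account name as its only argument.
    """
    if not USERNAME_RE.fullmatch(username):
        raise ValueError("unexpected characters in username")
    key = public_key.strip()
    if "\n" in key or len(key.split()) < 2:
        raise ValueError("expected '<type> <base64> [comment]' on one line")
    return f'command="{GIT_WRAPPER} {username}",{RESTRICTIONS} {key}'
```

The web frontend would regenerate the file from the database whenever a user adds or removes a key, so the extra step for users stays limited to pasting their public key once.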
"How do users submit packages? How does a user disown/adopt a package?"
I think disown could be automated by removing the git permissions for that user for a particular repo.
Adopt would work like this: adopting a package without maintainers would be allowed automatically, while adopting a package with multiple maintainers would require approval from the maintainers or TUs.
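The disown and adopt rules described here can be sketched as a small permission model. This is illustrative only: the `Package` class and function names are made up, and the sketch assumes that any package with at least one current maintainer needs approval from a maintainer or a TU.

```python
from dataclasses import dataclass, field


@dataclass
class Package:
    name: str
    maintainers: set = field(default_factory=set)


def disown(pkg, user):
    """Drop the user's push permission for this package's repo."""
    pkg.maintainers.discard(user)


def adopt(pkg, user, approver=None, trusted_users=frozenset()):
    """Try to add `user` as a maintainer; return True on success.

    An orphaned package can be adopted right away. Otherwise the request
    must be approved by a current maintainer or by a TU.
    """
    if pkg.maintainers:
        if approver not in pkg.maintainers and approver not in trusted_users:
            return False
    pkg.maintainers.add(user)
    return True
```

In a real setup the `maintainers` set would live in the AUR database, and the git access layer would consult it on every push.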
"How are comments and notifications managed?"
I would imagine that we could keep our current system.
"Also, how will package validation be done?"
This would probably be the most tricky question to answer at this time.
Maybe we can implement something with git hooks, though I don't know if there are any that run before the commit is made.
Just to reiterate another idea I mentioned to louipc, "another solution to this could be that we just don't change anything and do what my aur git mirror does: on every upload, it extracts the AUR package and commits the files to a git repo."
ZIP bombs will still be an issue with Git. Git compresses data as well and needs to uncompress it when checking stuff out. The vulnerability is just not that obvious, plus it's another format (zlib compression and deltas instead of gzip'ed tarballs), which doesn't make any difference tho. As I said before, we'd definitely need some quotas *and* some heuristic Git hooks to detect "ZIP bombs" (in a metaphorical sense).
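A quota check of this kind could start from the object sizes Git already reports. The sketch below assumes a hook has the output of `git ls-tree -r -l <commit>` available as text; the 1 MiB limit and the function names are illustrative assumptions. It only sums the uncompressed blob sizes of a single tree, so it is a starting point for a heuristic rather than a complete defence.

```python
import subprocess

DEFAULT_QUOTA = 1024 * 1024  # assumed limit in bytes


def ls_tree(commit, repo_dir="."):
    """Return the text output of `git ls-tree -r -l <commit>`."""
    result = subprocess.run(
        ["git", "ls-tree", "-r", "-l", commit],
        cwd=repo_dir, check=True, capture_output=True, text=True,
    )
    return result.stdout


def tree_size(ls_tree_output):
    """Sum the blob sizes listed by `git ls-tree -r -l`.

    Each line looks like '<mode> <type> <object> <size>TAB<path>'.
    Entries without a size (shown as '-') are skipped.
    """
    total = 0
    for line in ls_tree_output.splitlines():
        if not line.strip():
            continue
        meta, _path = line.split("\t", 1)
        size = meta.split()[3]
        if size != "-":
            total += int(size)
    return total


def over_quota(ls_tree_output, quota=DEFAULT_QUOTA):
    """True if the tree's total blob size exceeds the quota."""
    return tree_size(ls_tree_output) > quota
```

A hook would call `over_quota(ls_tree(new_commit))` for each pushed commit and refuse the push when it returns true.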
Another question that came into my mind... How do we parse packages? Like extract all the data from the PKGBUILD and put it in the MySQL database? We'll need to do this in some Git hook, too.
I kinda like the idea but it's not that easy to implement. Especially from a security point of view.
Submitting packages is done the same way as it's done now, using tarballs. Additionally, there's a text field where users can put submission reasons like "Initial upload.", "Updated $foo package to 1.4.2." or "Added $foobar patch.". After doing the usual tarball validation and PKGBUILD parsing, the tarball is extracted but *not* copied to a directory that is publicly accessible via HTTP. Instead, the incoming packages directory is a tree of Git repositories. After having extracted everything, there are two possibilities:
1. The submission is an initial upload. In this case, a new Git repo is first initialized inside the directory the tarball has been extracted to.
2. It's a package update. In this case, nothing special is done.
In both cases, `git commit -am "$foo"` is run, "$foo" being the submission message the user typed earlier.
The advantage of this is that a whole bunch of vulnerabilities (XSS, symlink attacks, malicious CGIs) won't affect us. We could just set up some cgit interface to give HTTP access to everything and link to the package repo's cgit URLs in the package details view. The PKGBUILD link will be replaced by a link to the plain cgit preview of the current HEAD's PKGBUILD file. The tarball download link will be replaced by a link to a snapshot tarball of the current HEAD's tree (cgit creates these automatically). We could even implement file preview for all source files again, linking to the cgit plain tree views of the files (using current HEAD as commit, of course). The only issue that would remain is the ZIP bomb vulnerability, which we already fixed [1] tho.
If everything works fine, we can still think about giving direct push access to the Git repos later. Does that sound nice? :)
[1] http://projects.archlinux.org/aur.git/commit/?id=09d8128f99c2edc27dd81efc63e9b3c797603ca1
* `git rm` all files from the directory (except dotfiles, of course)
* extract the tarball (we'll probably need to reject source tarballs that contain dotfiles to ensure no one overwrites the ".git" directory)
* add all extracted files to the index
* `git commit`
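The four steps above can be sketched roughly as follows. This is an assumption-laden illustration, not the AUR's code: the function names are made up, the repository is assumed to exist already with a committer identity configured, and `git commit` will fail (raising an exception here) if the upload changes nothing.

```python
import subprocess
import tarfile
from pathlib import PurePosixPath


def rejected_members(names):
    """Return member names that are absolute, contain '..', or have a
    dotfile component (which also covers anything under '.git')."""
    bad = []
    for name in names:
        path = PurePosixPath(name)
        if path.is_absolute() or any(p.startswith(".") for p in path.parts):
            bad.append(name)
    return bad


def commit_submission(repo_dir, tarball_path, message):
    """Replace the tracked files with the tarball contents and commit."""
    with tarfile.open(tarball_path) as tar:
        members = tar.getmembers()
        bad = rejected_members(m.name for m in members)
        # Also refuse symlinks, devices and other special entries.
        bad += [m.name for m in members if not (m.isfile() or m.isdir())]
        if bad:
            raise ValueError(f"rejected tarball members: {bad}")
        # Step 1: remove all tracked files (no dotfiles are tracked,
        # because they are rejected on upload).
        subprocess.run(["git", "rm", "-r", "-q", "--ignore-unmatch", "."],
                       cwd=repo_dir, check=True)
        # Step 2: extract the new files.
        tar.extractall(repo_dir)
    # Steps 3 and 4: stage everything and commit.
    subprocess.run(["git", "add", "--all"], cwd=repo_dir, check=True)
    subprocess.run(["git", "commit", "-q", "-m", message],
                   cwd=repo_dir, check=True)
```

Validating the member list before touching the working tree means a rejected upload leaves the repository exactly as it was.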
That's a good trade off.
Still has the ease of uploading .tar.gzs from a web interface with history.
Sounds nice :)
FS#15043 would make this work really well. The main blocking part of this seems to be the parsing of the PKGBUILD after it's uploaded. With a dedicated .SRCINFO file there won't be any issues with that. Also, the only thing that would have to happen when adding tar files is to extract all non-dotfiles to the right git repository, put the .SRCINFO file there, and run `git commit -a`.
* the parsing of PKGBUILDs takes place on the client end, so there are no issues with bash breaking things.
* make there be an extremely tight quota on packages that can be overridden by special permission (1MB would probably be fine).
* just a post-receive hook to ensure that there actually is a PKGBUILD and .SRCINFO at least.
* it could use the way mentioned above to add files the normal way (as .src.tar.gz files that also have a tiny upload limit)
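The hook check from the list above might look something like this sketch. The helper names are hypothetical, and the file list is assumed to come from `git ls-tree -r --name-only <commit>`. One caveat: a post-receive hook runs after the refs have been updated, so it can only report a problem; actually refusing a push would need a pre-receive or update hook, which is what the input parser below assumes.

```python
REQUIRED_FILES = ("PKGBUILD", ".SRCINFO")


def missing_required(paths):
    """Return the required top-level files absent from a commit's tree."""
    present = set(paths)
    return [name for name in REQUIRED_FILES if name not in present]


def parse_ref_updates(stdin_text):
    """Parse pre-receive hook input: one '<old> <new> <ref>' per line."""
    updates = []
    for line in stdin_text.splitlines():
        if line.strip():
            old, new, ref = line.split(" ", 2)
            updates.append((old, new, ref))
    return updates
```

A hook script would read its standard input through `parse_ref_updates`, list the files of each new commit, and exit with a non-zero status if `missing_required` returns anything.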
Has there been any progress on this issue? What is the blocking part?
A design doc? Code? An interested party to take ownership?
This is volunteer open source development... no one is going to work on this unless they have the time and interest.