FS#20416 - {archweb} Implement architecture differences

Attached to Project: Arch Linux
Opened by Pierre Schmitz (Pierre) - Monday, 09 August 2010, 09:05 GMT
Last edited by Dan McGee (toofishes) - Wednesday, 08 September 2010, 15:32 GMT
Task Type Feature Request
Category Web Sites
Status Closed
Assigned To Dan McGee (toofishes)
Architecture All
Severity Low
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 0
Private No

Details

It would be nice to have a table within our dashboard to see if there are different versions of packages for each architecture.

Once upon a time I wrote a proof of concept which lasted longer than anticipated. ;-) But the best place for this should be the dev section of archlinux.org

My sample implementation can be found at https://www.archlinux.de/?page=ArchitectureDifferences and its source at https://git.archlinux.de/www.archlinux.de.git/tree/pages/ArchitectureDifferences.php

We may be able to optimize the SQL query, but in the end we might still need to cache the result or prepare the table when reporead runs.

The idea behind this is as follows:
* i is the alias for the table of i686 packages and x the alias for the x86_64 packages (I hardcoded the ids somehow, ignore that)
* version is the version read from the db files. In archweb's schema this will be concat(pkgver, '-', pkgrel)
* Make three queries and combine their results:
1) get all packages with same pkgname and same repo but different architecture and version
2) get all i686 packages for which there is not the same pkgname in the same repo and x86_64 architecture
3) do the same the other way round (x86_64 packages for which there is no i686 package with the same pkgname in the same repo)
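
The three queries above can be sketched roughly as follows. This is only an illustration: the single `packages` table, its column names, and the sample rows are assumptions made up for the example, and neither archweb's schema nor the archlinux.de implementation is reproduced here. The three parts are combined with UNION ALL.

```python
import sqlite3


def build_sample_db() -> sqlite3.Connection:
    """Create an in-memory database with a few made-up packages."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE packages (
            id INTEGER PRIMARY KEY,
            pkgname TEXT, repo TEXT, arch TEXT, pkgver TEXT, pkgrel TEXT
        );
        INSERT INTO packages (pkgname, repo, arch, pkgver, pkgrel) VALUES
            ('bash', 'core', 'i686',   '4.1',   '1'),
            ('bash', 'core', 'x86_64', '4.1',   '2'),
            ('zlib', 'core', 'i686',   '1.2.5', '1'),
            ('zlib', 'core', 'x86_64', '1.2.5', '1'),
            ('grub', 'core', 'i686',   '0.97',  '17'),
            ('lib32-glibc', 'community', 'x86_64', '2.12', '1');
    """)
    return conn


# Hypothetical schema: one row per (pkgname, repo, arch).
DIFF_QUERY = """
    -- 1) same pkgname and repo, different architecture and version
    SELECT i.pkgname, i.repo,
           i.pkgver || '-' || i.pkgrel AS i686_version,
           x.pkgver || '-' || x.pkgrel AS x86_64_version
    FROM packages i
    JOIN packages x ON x.pkgname = i.pkgname AND x.repo = i.repo
    WHERE i.arch = 'i686' AND x.arch = 'x86_64'
      AND (i.pkgver != x.pkgver OR i.pkgrel != x.pkgrel)
    UNION ALL
    -- 2) i686 packages without an x86_64 counterpart in the same repo
    SELECT i.pkgname, i.repo, i.pkgver || '-' || i.pkgrel, NULL
    FROM packages i
    WHERE i.arch = 'i686' AND NOT EXISTS (
        SELECT 1 FROM packages x
        WHERE x.pkgname = i.pkgname AND x.repo = i.repo AND x.arch = 'x86_64')
    UNION ALL
    -- 3) the other way round
    SELECT x.pkgname, x.repo, NULL, x.pkgver || '-' || x.pkgrel
    FROM packages x
    WHERE x.arch = 'x86_64' AND NOT EXISTS (
        SELECT 1 FROM packages i
        WHERE i.pkgname = x.pkgname AND i.repo = x.repo AND i.arch = 'i686')
    ORDER BY 1
"""


def find_differences(conn: sqlite3.Connection) -> list:
    """Return (pkgname, repo, i686_version, x86_64_version) rows that differ."""
    return conn.execute(DIFF_QUERY).fetchall()


if __name__ == "__main__":
    for row in find_differences(build_sample_db()):
        print(row)
```

With the sample data, zlib is identical on both architectures and is therefore left out of the result; bash, grub and lib32-glibc are reported.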

Once these data are collected we can optionally hide differences that only differ in "sub" pkgrel (like 1 vs 1.1) or hide lib32 packages in community.
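
A minimal sketch of that optional filtering, assuming the differences have already been collected into simple records. The `Difference` type, its field names, and the `lib32-` prefix check are hypothetical and only serve the example.

```python
from typing import List, NamedTuple, Optional, Tuple


class Difference(NamedTuple):
    """One reported difference; a side is None if the package is missing there."""
    pkgname: str
    repo: str
    i686: Optional[Tuple[str, str]]    # (pkgver, pkgrel)
    x86_64: Optional[Tuple[str, str]]  # (pkgver, pkgrel)


def is_minor_difference(diff: Difference) -> bool:
    """True if both sides exist and differ only in the "sub" pkgrel (1 vs 1.1)."""
    if diff.i686 is None or diff.x86_64 is None:
        return False
    (ver_a, rel_a), (ver_b, rel_b) = diff.i686, diff.x86_64
    # Compare only the part of pkgrel before the first dot.
    return ver_a == ver_b and rel_a.split(".")[0] == rel_b.split(".")[0]


def is_lib32_in_community(diff: Difference) -> bool:
    """True for lib32 packages in community (assumed to carry a lib32- prefix)."""
    return diff.repo == "community" and diff.pkgname.startswith("lib32-")


def filter_differences(diffs: List[Difference],
                       hide_minor: bool = True,
                       hide_lib32: bool = True) -> List[Difference]:
    """Drop the differences the viewer asked to hide."""
    kept = []
    for diff in diffs:
        if hide_minor and is_minor_difference(diff):
            continue
        if hide_lib32 and is_lib32_in_community(diff):
            continue
        kept.append(diff)
    return kept


SAMPLE = [
    Difference("bash", "core", ("4.1", "1"), ("4.1", "1.1")),
    Difference("grub", "core", ("0.97", "17"), None),
    Difference("lib32-glibc", "community", None, ("2.12", "1")),
    Difference("vim", "extra", ("7.3", "1"), ("7.3", "2")),
]

if __name__ == "__main__":
    for diff in filter_differences(SAMPLE):
        print(diff.pkgname)
```

Both filters are switches rather than fixed behaviour, since the text describes them as optional.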

Closed by Dan McGee (toofishes)
Wednesday, 08 September 2010, 15:32 GMT
Reason for closing: Implemented
Additional comments about closing: Added links, next deploy
Comment by Pierre Schmitz (Pierre) - Monday, 09 August 2010, 09:09 GMT
Additional comment: I don't know enough about Django, or even Python, to be of much help here, but I can offer some ideas. My guess is that we cannot use the ORM abstraction due to performance issues and have to use SQL directly. And we most likely cannot perform this query live because it takes several seconds. Perhaps reporead could create a table with the result so that a simple Django model can read that data directly.
Comment by Dan McGee (toofishes) - Monday, 09 August 2010, 12:09 GMT
Yes, I've been wanting to do this at some point; thanks for the links.
Comment by Dan McGee (toofishes) - Wednesday, 25 August 2010, 19:17 GMT
Performance issues: not even close to being a problem if you structure the query right.

Preview of what I have for now:
http://www.archlinux.org/packages/differences/

And how it is done:
http://projects.archlinux.org/archweb.git/commit/?id=ae5483c230d08c65d91eb7cece106b4f13a56232
Comment by Pierre Schmitz (Pierre) - Saturday, 28 August 2010, 12:42 GMT
Good idea. I didn't think that doing the sorting and comparing on the "client" would speed it up that much.

Is "AND p.id != q.id" really needed? If that is the primary key they cannot be the same if arch_id is already different.

If you move up the pkgver and pkgrel comparison into the left join condition the "q.id IS NULL" should not be needed anymore as it is implicit for left joins.

PS: Is that url the final one? I could then add a redirect to it from my diff page.
Comment by Dan McGee (toofishes) - Saturday, 28 August 2010, 15:33 GMT
Did you try your suggested changes? The p.id != q.id could be dropped, but being explicit doesn't hurt since it doesn't affect the query plan. However, pushing the pkgver and pkgrel into the ON condition doesn't work at all; 7468 rows are returned instead of 184. It has to be outside the ON clause because we need filtering on both tables, not just one, and having no where clause causes the q side to not be restricted at all.
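
The effect can be reproduced on a toy table. The sketch below is not the query from the commit; it is rebuilt from the conditions named in this thread, and the `packages` table, its columns, and the sample rows are assumptions for illustration. With the version comparison in the WHERE clause only differing or unmatched packages remain; with it moved into the ON clause and no WHERE clause, the left join keeps every row of p.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE packages (
        id INTEGER PRIMARY KEY,
        pkgname TEXT, repo TEXT, arch TEXT, pkgver TEXT, pkgrel TEXT
    );
    INSERT INTO packages (pkgname, repo, arch, pkgver, pkgrel) VALUES
        ('bash', 'core', 'i686',   '4.1',   '1'),
        ('bash', 'core', 'x86_64', '4.1',   '2'),
        ('zlib', 'core', 'i686',   '1.2.5', '1'),
        ('zlib', 'core', 'x86_64', '1.2.5', '1'),
        ('grub', 'core', 'i686',   '0.97',  '17');
""")

# Version comparison outside the ON clause: rows where p and q match in
# version are filtered out, unmatched p rows survive via "q.id IS NULL".
WHERE_QUERY = """
    SELECT p.pkgname, p.arch, q.arch
    FROM packages p
    LEFT JOIN packages q
      ON p.pkgname = q.pkgname AND p.repo = q.repo AND p.arch != q.arch
    WHERE q.id IS NULL OR p.pkgver != q.pkgver OR p.pkgrel != q.pkgrel
"""

# Version comparison inside the ON clause, no WHERE: a failed comparison
# only makes the q side NULL, the p row is still returned.
ON_QUERY = """
    SELECT p.pkgname, p.arch, q.arch
    FROM packages p
    LEFT JOIN packages q
      ON p.pkgname = q.pkgname AND p.repo = q.repo AND p.arch != q.arch
     AND (p.pkgver != q.pkgver OR p.pkgrel != q.pkgrel)
"""

where_rows = conn.execute(WHERE_QUERY).fetchall()
on_rows = conn.execute(ON_QUERY).fetchall()

print(len(where_rows), len(on_rows))  # prints: 3 5
```

Here the ON variant returns all five packages, including both zlib rows with a NULL q side, which mirrors the jump from 184 to 7468 rows described above.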

Yes, unless someone objects to that URL, I think that is a decent spot for it. Maybe don't take yours down yet, though, or just write up a quick list here of things not implemented yet: .y pkgrel filtering, screening out multilib, etc.
Comment by Dan McGee (toofishes) - Wednesday, 08 September 2010, 05:58 GMT
Check it out now, I just rolled out more filtering. Going to close this.
Comment by Eric Belanger (Snowman) - Wednesday, 08 September 2010, 15:06 GMT
You forgot to add a link on the home page. Currently you need to know the URL to access it.

Otherwise looks good.
