Pacman

FS#20056 - [pacman] implement parallel transfers

Attached to Project: Pacman
Opened by Tomas Mudrunka (harvie) - Friday, 02 July 2010, 18:39 GMT
Last edited by Dan McGee (toofishes) - Wednesday, 23 March 2011, 15:44 GMT
Task Type: Feature Request
Category: General
Status: Assigned
Assigned To: Dave Reisner (falconindy)
Architecture: All
Severity: Medium
Priority: Normal
Reported Version: 3.3.3
Due in Version: Undecided
Due Date: Undecided
Percent Complete: 0%
Votes: 38
Private: No

Details

Summary and Info:
I'd like to see parallel transfers implemented directly in pacman. We could download two or more packages at the same time, each package possibly from a different mirror, and -Sy could fetch the repo indexes from all repositories in parallel too.

I know there are wrappers like powerpill, etc., but I don't like them much, and I think such a feature should be implemented directly in pacman. Pacman can already use third-party binaries ("XferCommand") to download packages and other files, so I believe such commands could be run in the background...

But the clean output of pacman should also be preserved. E.g. apt-get is able to process several files in parallel, but its output makes an ugly mess of the terminal. There should be a nice progress bar showing the overall progress without leaving a mess on screen, or maybe there could be multiple progress bars (one for each running process). Either way, the clean look of pacman's output should definitely be preserved; still, parallel transfers are a really useful feature...

peace

Comment by Dan McGee (toofishes) - Friday, 02 July 2010, 18:42 GMT
I've never understood the need for this type of thing. Are there not mirrors that can keep up with your connection speed? This would not benefit me whatsoever, as I always saturate my connection, so the chances of me wanting to work on this are pretty close to zero.
Comment by Tomas Mudrunka (harvie) - Friday, 02 July 2010, 19:07 GMT
1.) Your ISP may already have the file in its cache by the time you request it.
2.) If you are waiting on one of the unresponsive repos (unofficial repos are often down, e.g. arch-games), you could download files from the other repos in the meantime instead of just waiting for the timeout.
2.1) Sometimes a server simply gets stuck and one of its processes becomes somewhat unresponsive, while new connections still run fine.
3.) Nowadays it's a bit of a shame to do FTP/HTTP transfers synchronously. Imagine some primitive server software that could serve only one client at a time, with that client stalled; that might have passed in some early-ARPANET laboratory, but now? Oh, come on! This is similar: we should not wait for the slow ones.

Imagine a 100% parallelized future with big clusters of servers with multiple quantum processors and "distributed everything" forming one big swarm sharing all resources. Arch Linux is fast and almost ready for the future; pacman should be too :-)))
Comment by Allan McRae (Allan) - Thursday, 08 July 2010, 04:47 GMT
Despite the above explanation, which confused me more than anything... I can see some use for this, coming from a place that tends to be far away from the servers of various non-Arch repos.

However, I generally do other things while an update runs, so I would have no motivation to implement this either. Also, I think it would be hard to keep our nice output (I agree that apt's output in this respect is horrendous).

I think this is a "patches welcome" type of request, as long as the number of parallel downloads is configurable and the output stays sane.
Comment by Tomas Mudrunka (harvie) - Thursday, 24 March 2011, 15:25 GMT
I can imagine the output as:

file: [++++______+++++_____+++_______]
(the same file downloaded in parallel three times, perhaps from different mirrors specified in pacman.conf)

core/file1, archaudio/file2, arch-games/file3:
[+++++_________][++++++++______][+++___________]
(three different files downloaded in parallel; in the optimal case, each one downloaded from a different repository)
Comment by Matt Peterson (ricochet1k) - Friday, 22 April 2011, 15:58 GMT
I know there are a few programs, such as Aria2, that are designed to run many simultaneous downloads read from a file, and that already have some kind of nice output handling. Adding parallel downloads directly to pacman might be hard, but how hard would it be to add a ParallelXferCommand option to pacman.conf that takes an output directory and a file listing the files to download? That way, at least it wouldn't take a pacman wrapper to do parallel downloads.

In fact, I might even be willing to dig into pacman myself and add this functionality.
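
For illustration, a minimal sketch of the input-file mode such downloaders already offer, using aria2 (the flags are real aria2c options; urls.txt and the target directory are just placeholders):

  # urls.txt holds one package URL per line
  aria2c --max-concurrent-downloads=4 --dir=./packages --input-file=urls.txt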
Comment by Pablo Lezaeta (Jristz) - Tuesday, 26 April 2016, 21:49 GMT
Question: without the hassle of a wrapper, is it possible, in the current state of pacman, to just add an XferCommand with the proper invocation to do a parallel download using aria2?
If that is possible with the current state of pacman, I don't see why an XferCommand example for a parallel download with aria2 can't simply be documented instead of building a wrapper around pacman.
On the other side, the only thing I see missing is correct handling of .part files in pacman, to prevent race conditions and to make pacman wait until all download operations in XferCommand are finished.
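
For reference, a minimal pacman.conf sketch of that idea (these are real aria2c flags; the caveat is that pacman invokes XferCommand once per file, so aria2 here can only split a single file across connections, not download several packages at once):

  # /etc/pacman.conf
  XferCommand = /usr/bin/aria2c --allow-overwrite=true --continue=true --max-connection-per-server=4 --min-split-size=1M --dir=/ --out %o %u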
Comment by mauro (lesto) - Sunday, 18 February 2018, 18:20 GMT
Looking into it right now; unfortunately the code is quite uncommented, and navigating through the callbacks is a bit complex.

QUICK HACK SOLUTION - no compilation required:
- run pacman -Sup to get the list of all the packages that need to be downloaded
- download them yourself (I'm thinking of doing it in parallel with torrent + webseed)
- run pacman -Syu
- your XferCommand copies each requested package from your download directory to the path requested by pacman (downloading any missing package on the spot)

This works, but it requires some hooks before and after pacman (to clean the cache), something a helper could do easily; a sketch follows below.
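
A rough shell sketch of that hack, using aria2 as the parallel downloader (the staging directory and the fetch-from-staging helper are illustrative names, not anything pacman ships):

  # 1. collect the URLs pacman would download
  pacman -Sup > /tmp/pkg-urls.txt
  # 2. fetch them all in parallel into a staging directory
  aria2c --max-concurrent-downloads=8 --dir=/var/cache/pacman/staging --input-file=/tmp/pkg-urls.txt
  # 3. upgrade; pacman.conf points XferCommand at the helper:
  #    XferCommand = /usr/local/bin/fetch-from-staging %u %o
  pacman -Syu

where fetch-from-staging is a small script along these lines:

  #!/bin/sh
  # copy the already-downloaded package from staging, or fetch it on the spot
  url=$1; out=$2
  staged="/var/cache/pacman/staging/$(basename "$url")"
  if [ -f "$staged" ]; then cp "$staged" "$out"; else curl -fL -o "$out" "$url"; fi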

BETTER SOLUTION:
The problem seems to be in download_with_xfercommand(): it is called for every single file, and in particular it calls rename() after each download (line 277 of conf.c, commit a7dbe4635b4af9b5bd927ec0e7a211d1f6c1b0f2).
Now, it seems that if we don't add %o, it will not rename or check the file. That makes life harder for the script; I have no idea how to tell it the download path, but we could add a new parameter.

At that point it is our command in XferCommand that reports each new file to download to a daemon.

Finally, we also need to add a new option, XferCommand_wait: if present, AFTER "downloading" all files, pacman calls this command and waits for it to finish (the classic 0 for success, anything else for an error).
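
Under that proposal, pacman.conf might look something like this (queue-url, drain-queue, and the XferCommand_wait option itself are all hypothetical; nothing like this exists in pacman today):

  # queue-url only records the URL; drain-queue blocks until every queued download finishes
  XferCommand = /usr/local/bin/queue-url %u
  XferCommand_wait = /usr/local/bin/drain-queue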

BEST SOLUTION:
If a special setting is created, say "multi_xfer", then in download_files() the whole "for(i = files; i; i = i->next) {...}" block is replaced by a call to the executable specified by multi_xfer. Its parameters would be -c "cache_dir", -s "server1,server2,...", and -p "package1,package2"; the server parameter is the only optional one.
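In other words, pacman would make a single call along these lines (the multi_xfer interface is entirely hypothetical; mirror URLs and package names are placeholders):

  multi_xfer -c /var/cache/pacman/pkg -s "https://mirror1.example/core,https://mirror2.example/core" -p "foo-1.0-1-x86_64.pkg.tar.xz,bar-2.3-1-x86_64.pkg.tar.xz"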

This seems the cleanest, but maybe I'm missing some side effect of the function we would no longer be calling (for example, firing ALPM_EVENT_PKGDOWNLOAD_START and the other per-file events; I guess this is not a big issue, though, as I THINK we could just print the output of the "multi_xfer" executable!)

I'm going to wait for a response from someone expert in the code before starting the modification. The function call should be easy; what I still have to dig into is how to get multi_xfer from pacman.conf / the argument list.
