FS#7252 - Arch install 10x as slow as it should be (ftp CDUP/CWD bug)

Attached to Project: Pacman
Opened by Troels Liebe Bentsen (tlbdk) - Wednesday, 23 May 2007, 23:09 GMT
Last edited by Aaron Griffin (phrakture) - Wednesday, 16 January 2008, 19:16 GMT
Task Type Bug Report
Category Backend/Core
Status Closed
Assigned To Aaron Griffin (phrakture)
Architecture All
Severity Low
Priority Normal
Reported Version 3.0.4
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 0
Private No

Details

Why does pacman, during installation (I have not tested it afterwards), go back to the root of the ftp server and then return to where it was before, for each package it downloads?

This results in at least 10 (24 on some mirrors) useless commands for each file it has to download, and that is only when the target directory is 3 levels below the root, as in "/0.8/os/i686". For other paths this will be (n * 2 + n), where n is the number of directories to enter before getting to the file.
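As a rough illustration of that count, the sequence seen between two RETR commands in the Wireshark trace of this report can be modelled with a short sketch. The function name `overhead_commands` is hypothetical, and the pattern (one PWD, then CDUP + PWD per level up, then one CWD per level down) is read off the trace rather than taken from pacman's source. NOOP, MODE, TYPE, SIZE and MDTM are left out.

```python
def overhead_commands(path):
    """Model the directory commands seen between two downloads in the trace.

    Assumption (from the trace, not from pacman's code): one PWD in the
    current directory, then CDUP + PWD for every level up to the root,
    then one CWD for every level back down.
    """
    parts = [p for p in path.split("/") if p]
    commands = ["PWD"]
    for _ in parts:
        commands += ["CDUP", "PWD"]
    commands += ["CWD " + p for p in parts]
    return commands


# Three levels deep, as in the trace: 1 + 2*3 + 3 = 10 commands.
print(len(overhead_commands("/current/os/i686")))
```

For a depth of 3 this gives 10, which matches the "at least 10" seen in the trace. The trace has one more PWD than the (n * 2 + n) estimate accounts for, so under this model the count is 3n + 1.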

In my case about 9/10 of the install time is waiting for useless ftp commands to finish.

Wireshark trace:
227 Entering Passive Mode (81,170,139,233,40,244)

RETR grub-0.97-7.pkg.tar.gz

150 Opening BINARY mode data connection for grub-0.97-7.pkg.tar.gz (225709 bytes).

226 File send OK.

NOOP

200 NOOP ok.

PWD

257 "/0.8/os/i686"

CDUP

250 Directory successfully changed.

PWD

257 "/0.8/os"

CDUP

250 Directory successfully changed.

PWD

257 "/0.8"

CDUP

250 Directory successfully changed.

PWD

257 "/"

CWD current

250 Directory successfully changed.

CWD os

250 Directory successfully changed.

CWD i686
250 Directory successfully changed.
MODE S

200 Mode set to S.

TYPE I

200 Switching to Binary mode.

SIZE gzip-1.3.12-2.pkg.tar.gz

213 48811

MDTM gzip-1.3.12-2.pkg.tar.gz

213 20070417193737

MODE S

200 Mode set to S.

TYPE I

200 Switching to Binary mode.

PASV

227 Entering Passive Mode (81,170,139,233,44,184)

RETR gzip-1.3.12-2.pkg.tar.gz

Closed by  Aaron Griffin (phrakture)
Wednesday, 16 January 2008, 19:16 GMT
Reason for closing:  Won't implement
Additional comments about closing:  Symlinks fixed, no longer an issue
Comment by Aaron Griffin (phrakture) - Thursday, 24 May 2007, 15:21 GMT
This is known - the problem is the directory symlinks. PWD reports the real (non-symlinked) directory, whereas the symlinked dir is what is listed in the server line.

There are two fixes.

The first is for us to change the dirs on the ftp site and wait for mirrors to sync (we will be doing this in the near future anyway, due to release numbering changes).

The second is to change your server entries for '[current]' to use "0.8/os/i686" instead of "current/os/i686".

PS You can use --debug=2 to dump debug output, which includes the information that wireshark gave you above.
Comment by Troels Liebe Bentsen (tlbdk) - Thursday, 24 May 2007, 15:55 GMT
But my question is rather: why issue PWD for the directory if the CWD command succeeded? And why do it after every file?

Looking at other ftp clients such as lftp, it only issues CWD once even for multiple transfers, and it does not issue PWD unless asked to.

What is the reasoning behind pacman having this behavior? And would the best solution not be to stop using PWD, and just behave as lftp?



Comment by Aaron Griffin (phrakture) - Thursday, 24 May 2007, 16:13 GMT
This is by design. libdownload caches connections like that. When a new file is requested, it checks the current directory for the connection to that server. Keep in mind we are not saying "download these 5 files"; pacman instead does "download this file, do some stuff, do some more, then download the next file".

Because each file is distinct, the PWD must be checked when opening a cached connection.
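A minimal sketch of that check, assuming the client compares the path reported by PWD with the path from the server line and only moves when they differ. The function names are hypothetical and this is not libdownload's actual code. Whether libdownload stops at a common ancestor or always walks to the root is also an assumption here; the trace cannot distinguish the two, because "0.8" and "current" already differ at the first component.

```python
def split_path(path):
    """Split an FTP path into its non-empty components."""
    return [p for p in path.split("/") if p]


def commands_for_cached_connection(reported_pwd, wanted_path):
    """Return the CDUP/CWD commands needed to reach wanted_path.

    reported_pwd is what the server answers to PWD; wanted_path is the
    directory taken from the configured server entry.
    """
    have = split_path(reported_pwd)
    want = split_path(wanted_path)
    common = 0
    while common < min(len(have), len(want)) and have[common] == want[common]:
        common += 1
    ups = ["CDUP"] * (len(have) - common)
    downs = ["CWD " + p for p in want[common:]]
    return ups + downs


# Symlinked server entry: PWD never matches, so every file pays the full walk.
print(commands_for_cached_connection("/0.8/os/i686", "/current/os/i686"))
# Server entry using the real directory: nothing to do.
print(commands_for_cached_connection("/0.8/os/i686", "/0.8/os/i686"))
```

Under this model the extra commands only appear when the server entry goes through a symlink, which is consistent with both fixes suggested earlier in the thread.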

I did test this. A single PWD is not causing any slow down whatsoever. The only "issues" are caused by the PWD of the given FTP server reporting something odd due to symlinks. This is a server-side issue, and not an issue with pacman or libdownload.
Comment by Troels Liebe Bentsen (tlbdk) - Wednesday, 30 May 2007, 13:13 GMT
One might argue that it would be the job of libdownload to know what directory it is currently in, and that using PWD would not be necessary at all; this is also how most other ftp clients are implemented anyway. I do agree it's a server side issue, but as most ftp servers don't agree on how symlinks should be reported by PWD (is it covered by the RFC, did they have symlinks back then?), I guess the best fix would still be not to use PWD at all. But as I have yet to look at the code, I don't know how easy this would be to fix.

I would love to have a look at the problem instead of just bitching about it :) and try to make a patch for libdownload and pacman in a week or two when I have handed in most of my assignments.
Comment by Aaron Griffin (phrakture) - Wednesday, 30 May 2007, 17:48 GMT
Sure, I guess it could be checked by libdownload, which seems fine to me, but it's not critical in my eyes.

svn is here http://phraktured.net/libdownload/

While the functionality seems fine as is (a server side change would fix it), I'm not against a patch to fix this. As a nudge in the right direction, you'd want to set the pwd on each connection structure, and maintain it... it's a little complex, but ftp.c is probably what you want.
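The idea of keeping the pwd on the connection structure could look roughly like the sketch below. It is written in Python for brevity rather than in the C of ftp.c, and every name in it (`CachedConnection`, `change_dir`, the `send` callable) is hypothetical, not part of libdownload. It assumes `send` raises an exception when the server rejects a command, and that the server accepts a CWD with a full path.

```python
class CachedConnection:
    """Cached FTP control connection that remembers its own directory."""

    def __init__(self, send):
        self._send = send  # callable that sends one command string
        self.cwd = None    # unknown until the first successful CWD

    def change_dir(self, path):
        """Issue CWD only when the remembered directory differs; never PWD."""
        if path == self.cwd:
            return
        self._send("CWD " + path)
        # Only reached if send() did not raise, so the record stays accurate.
        self.cwd = path


sent = []
conn = CachedConnection(sent.append)
conn.change_dir("/current/os/i686")  # first file: one CWD
conn.change_dir("/current/os/i686")  # next file: nothing sent
print(sent)
```

Because the client compares against the path it asked for, not the path the server reports, symlinks on the server no longer matter in this model.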
Comment by Dan McGee (toofishes) - Wednesday, 16 January 2008, 05:23 GMT
Lack of activity = closing soon unless someone speaks up.
Comment by Aaron Griffin (phrakture) - Wednesday, 16 January 2008, 19:16 GMT
Original error is fixed (symlinks)
Everything else is "functioning as intended" in my eyes. At the very least, this is a libdownload issue, and has nothing to do with pacman.
