FS#11292 - handling multiple source files which have the same name

Attached to Project: Pacman
Opened by G_Syme (G_Syme) - Saturday, 23 August 2008, 08:57 GMT
Last edited by Dan McGee (toofishes) - Wednesday, 27 August 2008, 13:53 GMT
Task Type Bug Report
Category makepkg
Status Closed
Assigned To Xavier (shining)
Architecture All
Severity Medium
Priority Normal
Reported Version 3.2.0
Due in Version 3.2.1
Due Date Undecided
Percent Complete 100%
Votes 1
Private No

Details

Summary and Info:
If you have 2 or more files in the source array of a PKGBUILD that have the same filename, makepkg cannot distinguish between them. It downloads the first one, then looks at its cache and thinks it already has the second file, too. As a result, the two files cannot both be downloaded, nor can both be used in the build() function.

Steps to Reproduce:
make a source array like
source=(http://url.for.first.file/TestFile.txt \
http://url.for.second.file/TestFile.txt)
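
A minimal sketch of why the two entries collide, assuming the local filename is derived by taking everything after the last "/" of the URL (this is an assumption based on the description above, and local_name is a hypothetical helper, not a makepkg function):

```shell
#!/bin/bash
# Assumption: the cached filename is simply the last path component of the URL.
source=(http://url.for.first.file/TestFile.txt
        http://url.for.second.file/TestFile.txt)

# Hypothetical helper: strip everything up to and including the final "/".
local_name() {
    printf '%s\n' "${1##*/}"
}

# Both URLs map to the same local name, so the second looks "already present".
for url in "${source[@]}"; do
    echo "$url -> $(local_name "$url")"
done
```

Both lines end in TestFile.txt, so a lookup by filename cannot tell the two sources apart.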

Original BBS Thread:
http://bbs.archlinux.org/viewtopic.php?pid=410585

Suggestions (AKA the Feature Request part):
The only idea I've had so far would work a bit similar to the binding of parameters in a function call.
You could decouple the filename in the source URL from the filename under which it is saved (and hence accessible in the build() function and for "makepkg -g"). That would require a new syntax element to tell makepkg to save a file under a given name/alias.

E.g. something like the following (the syntax is of course only a suggestion/idea):
source=(FirstFileAlias.txt: http://url.for.first.file/TestFile.txt \
SecondFileAlias.txt: http://url.for.second.file/TestFile.txt)

Then you could use FirstFileAlias.txt and SecondFileAlias.txt in the build() function.
If no alias is specified (as in 99.99% of current PKGBUILDs, because it is so seldom required), makepkg would simply treat the filename itself as the alias. That way, the new syntax would just be an extension of the current one, to be used when the need arises.

Closed by  Dan McGee (toofishes)
Wednesday, 27 August 2008, 13:53 GMT
Reason for closing:  Fixed
Additional comments about closing:  Commit d6f62ba22d5c6da484f4a7f0876b203ad545342a
Comment by Allan McRae (Allan) - Saturday, 23 August 2008, 09:22 GMT
This would also be useful for files with stupid download URLs
Comment by Xavier (shining) - Saturday, 23 August 2008, 10:57 GMT
Having to bother fixing this for 0.01% of the PKGBUILDs is not very motivating...
As you said in the BBS thread, you always have the workaround to download the second file in the build() function manually.
This is ugly, but this case is so extremely rare and stupid...

The problem is also that I can't think of any safe syntax to handle this. Your example is ambiguous:
FirstFileAlias.txt: http://url.for.first.file/TestFile.txt
So what do you do here, you split with ":" and get the first column?
Then if you do the same with a normal URL:
http://url.for.first.file/TestFile.txt
You will extract the filename "http".
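
To make the ambiguity concrete, here is a small sketch. It assumes each source entry reaches the parser as a single string, and alias_before_colon is a hypothetical helper written only for this illustration:

```shell
#!/bin/bash
# Hypothetical helper: treat everything before the first ":" as the alias.
alias_before_colon() {
    printf '%s\n' "${1%%:*}"
}

# Entry with an alias: the split gives the intended save name.
alias_before_colon "FirstFileAlias.txt: http://url.for.first.file/TestFile.txt"

# Plain URL: the same split returns the URL scheme instead of a filename.
alias_before_colon "http://url.for.first.file/TestFile.txt"
```

The first call prints FirstFileAlias.txt, the second prints http, which is the problem described above.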

Basically, what you will need is a character that cannot exist in ANY filename or ANY URL. Is that even possible? :P
Unless you have a smarter way to handle this?

In any case, we will have to waste time and effort on a very atypical issue.
Comment by Xavier (shining) - Saturday, 23 August 2008, 10:59 GMT
I missed what Allan said. This is indeed a very good point which I totally forgot about. This makes fixing this issue more interesting, so propose your ideas.
Comment by Allan McRae (Allan) - Saturday, 23 August 2008, 11:34 GMT
Not foolproof, but how about:
source="http://a.url.com/file.tar.gz -> renamed.tar.gz"

Splitting on " -> "... This is what I thought of when someone asked how to deal with strange URLs. I did not like it then and am still not convinced now.

The only other option I can think of, and which I am against, is having another array with the save names in it.
Comment by G_Syme (G_Syme) - Saturday, 23 August 2008, 12:16 GMT
As I mentioned in the report, the proposed syntax was totally arbitrary. I don't have much experience with bash syntax, and I only tried to follow existing conventions like the one for optdepends. It was not meant to be ready for implementation.

As for the character used to indicate an alias, the common procedure (at least in theory, from what I've heard) would be to choose a character, or a complete sequence like "->", that doesn't occur too often in a URL, and have it escaped if it is really needed in the URL.

As always, this is only an idea.
Comment by Dan McGee (toofishes) - Saturday, 23 August 2008, 12:34 GMT
We already use "::" when we are looking through download agents, so is that a candidate? I would rather not have three different delimiters for all the different things.

I think a syntax like this would be OK:
source=("savename1::http://example.com/fileone.txt" "savename2::http://archlinux.org/fileone.txt")
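
A rough sketch of how such entries could be split, assuming bash parameter expansion and the "::" delimiter proposed here. The helper names get_savename and get_download_url are hypothetical and are not taken from makepkg; entries without "::" fall back to the last path component of the URL:

```shell
#!/bin/bash
# Sketch only: helper names are hypothetical, not makepkg's own functions.

# Local filename: the part before the first "::" if present,
# otherwise the last path component of the URL.
get_savename() {
    local entry=$1
    if [[ $entry == *::* ]]; then
        printf '%s\n' "${entry%%::*}"
    else
        printf '%s\n' "${entry##*/}"
    fi
}

# Download URL: the part after the first "::" if present,
# otherwise the entry unchanged.
get_download_url() {
    printf '%s\n' "${1#*::}"
}

source=("savename1::http://example.com/fileone.txt"
        "savename2::http://archlinux.org/fileone.txt"
        "http://example.com/plain.txt")

for entry in "${source[@]}"; do
    echo "$(get_savename "$entry") <- $(get_download_url "$entry")"
done
```

Because "://" in a normal URL never contains "::", plain entries pass through untouched, while the two fileone.txt sources end up under distinct local names.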

Of course, no one has even ventured beyond the syntax issues yet. Now we have to figure out the output file option for all our downloaders, but we only want to use it sometimes, etc...
Comment by Xavier (shining) - Saturday, 23 August 2008, 12:45 GMT
Maybe that :: delimiter is not so bad.
As for figuring out the output file option for all our downloaders, Roman already did that with this commit:
f56f7ff39102dab754573b0bc40dbceb5a8ec301: makepkg: Support for resuming source downloads
Comment by Dan McGee (toofishes) - Saturday, 23 August 2008, 12:48 GMT
Wow, I think I'll put some failsauce on my cereal this morning. For some reason I didn't even draw the connection that the partial file stuff used the output file stuff, even though I was staring at the conf file this morning to see the "::" delimiter. I think we have a full solution here then, we just need to implement it?
Comment by Allan McRae (Allan) - Saturday, 23 August 2008, 12:59 GMT
That solution looks good to me. And with Roman's commit it should be a fairly simple fix.
Comment by Xavier (shining) - Sunday, 24 August 2008, 00:13 GMT
