FS#61179 - Cannot download packages with `+` in the name
Attached to Project:
Pacman
Opened by Paul Davis (dangersalad) - Wednesday, 26 December 2018, 19:42 GMT
Last edited by Allan McRae (Allan) - Sunday, 04 December 2022, 06:30 GMT
Details
I have a repo set up for my AUR packages and there is an
issue when trying to sync the various `libc++` packages.
It seems that `+` is not escaped properly, so the web server treats each `+` as a space when looking up the URL.
Closed by Allan McRae (Allan)
Sunday, 04 December 2022, 06:30 GMT
Reason for closing: No response
Additional comments about closing: Needs more details
$ pacman -Slq core extra community | grep -F +
dvd+rw-tools
foobillard++
libsigc++
libsigc++-docs
libstdc++5
libxml++
libxml++-docs
libxml++2.6
libxml++2.6-docs
memtest86+
timidity++
bonnie++
crypto++
dbus-c++
gtk2+extra
ls++
lucene++
mysql++
nicotine+
png++
tolua++
vsqlite++
There are 176 packages in the official repositories which contain a '+' somewhere in the download filename, including several packages essentially guaranteed to be on all Arch systems.
EDIT: I did not realize this wasn't already being done. I sort of assumed that if we weren't escaping this yet, the problem would have been reported before, and against repo packages.
"If a reserved character is found in a URI component and no delimiting role is known for that character, then it must be interpreted as representing the data octet corresponding to that character's encoding in US-ASCII."
For the path portion of a URI, the path component uses a set of sub-delims that includes the "+" symbol, so it should be percent-encoded. The fact that servers do not seem to unescape the "+" into a space appears to be an implementation decision, not something required by the standard. Moreover, pacman doesn't prescribe any particular URL syntax, and a package could be fetched from a URL such as "http://repo.com/package?repo=core&package=libxml++" (with a Content-Disposition header containing the on-disk filename).
So, pacman is wrong, even if there are plenty of servers where this happens to work.
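To illustrate the failure mode from the original report, here is a minimal sketch of a form-style decoder, the kind of decoding a server might apply if it treats "+" as an encoded space. The function names `hex_value` and `form_style_decode` are hypothetical and are not taken from pacman or from any particular web server.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: value of one hex digit, or -1 if c is not one. */
static int hex_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Hypothetical form-style decoder: '+' becomes a space and %XX becomes
 * the corresponding byte. Returns a malloc()ed string that the caller
 * must free(), or NULL if allocation fails. */
char *form_style_decode(const char *in)
{
    size_t len = strlen(in);
    char *out = malloc(len + 1); /* decoding never grows the string */
    char *p = out;

    if (out == NULL) {
        return NULL;
    }
    for (size_t i = 0; i < len; i++) {
        if (in[i] == '+') {
            *p++ = ' ';
        } else if (in[i] == '%' && i + 2 < len
                   && hex_value(in[i + 1]) >= 0
                   && hex_value(in[i + 2]) >= 0) {
            *p++ = (char)(hex_value(in[i + 1]) * 16 + hex_value(in[i + 2]));
            i += 2; /* skip the two hex digits just consumed */
        } else {
            *p++ = in[i];
        }
    }
    *p = '\0';
    return out;
}
```

With such a decoder, a raw "libc++" in the request is looked up as "libc" followed by two spaces, while "libc%2B%2B" decodes back to "libc++" as intended.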
As for your patch - the CURL documentation states that the "output" needs to be freed with the curl_free() function. Using different memory allocators (and mixing FREE() with curl_free()) is a bit of a PITA. I wonder if it would be better to just use the default memory allocator plus a simple URL escape function like this one: https://stackoverflow.com/a/21491633/576557
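For reference, a minimal sketch of what such a standalone escape function could look like, assuming plain malloc()/free() and assuming it is applied only to the filename part of the URL (it would also escape "/"). The names `is_unreserved` and `url_escape` are made up for illustration. This is not the code from the linked answer or from the patch.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Unreserved characters per RFC 3986: letters, digits, '-', '.', '_', '~'. */
static int is_unreserved(unsigned char c)
{
    return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')
        || (c >= '0' && c <= '9')
        || c == '-' || c == '.' || c == '_' || c == '~';
}

/* Hypothetical helper: percent-encode every byte that is not unreserved.
 * Returns a malloc()ed string that the caller releases with free(),
 * or NULL on allocation failure. */
char *url_escape(const char *in)
{
    static const char hex[] = "0123456789ABCDEF";
    size_t len = strlen(in);
    char *out, *p;

    /* Worst case: every byte expands to three characters. */
    if (len > (SIZE_MAX - 1) / 3) {
        return NULL;
    }
    out = malloc(len * 3 + 1);
    if (out == NULL) {
        return NULL;
    }
    p = out;
    for (size_t i = 0; i < len; i++) {
        unsigned char c = (unsigned char)in[i];
        if (is_unreserved(c)) {
            *p++ = (char)c;
        } else {
            *p++ = '%';
            *p++ = hex[c >> 4];
            *p++ = hex[c & 0x0F];
        }
    }
    *p = '\0';
    return out;
}
```

Because the result comes from malloc(), it can be released with the same free() path as the rest of the code, which avoids the curl_free() mismatch mentioned above.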