AUR web interface

**This is the bug tracker for the AUR web interface.**

Use this tracker to report bugs or make feature requests regarding the behaviour or implementation of the AUR software.
Please read the Reporting Bug Guidelines before filing a new task.
http://wiki.archlinux.org/index.php/Reporting_Bug_Guidelines

- Please report bugs related to Arch Linux official packages here: http://bugs.archlinux.org/index.php?project=1
- Please report bugs for [community] packages here: http://bugs.archlinux.org/index.php?project=5
- For any packages in the AUR contact the maintainer or leave a comment on the package's detail page.

Source Code:
https://projects.archlinux.org/aurweb.git/

FS#61605 - AUR web: Comments with Unicode characters are silently discarded

Attached to Project: AUR web interface
Opened by Alberto Salvia Novella (es20490446e) - Friday, 01 February 2019, 23:31 GMT
Last edited by Allan McRae (Allan) - Friday, 01 February 2019, 23:38 GMT
Task Type: Bug Report
Category: General
Status: Unconfirmed
Assigned To: No-one
Architecture: All
Severity: Low
Priority: Normal
Reported Version: git
Due in Version: Undecided
Due Date: Undecided
Percent Complete: 0%
Votes: 1
Private: No

Details

HOW TO REPRODUCE:
- On an AUR package page, add a comment containing a Unicode pictograph such as an emoji (https://getemoji.com/)

RESULT:
- The comment is silently discarded.

Comment by Eli Schwartz (eschwartz) - Sunday, 03 February 2019, 00:38 GMT
That actually sounds like a really cool idea, but unfortunately as far as I can tell, this works fine. I haven't tested on aur.archlinux.org as I have nothing to comment anywhere at the moment and no real interest in bothering people with junk comments just to test this -- but I've trialled it on a local test instance of the aurweb codebase, and I can submit comments with whatever sort of unicode I want.

Comment by Alberto Salvia Novella (es20490446e) - Sunday, 03 February 2019, 03:31 GMT
On the real website it doesn't work. And the comments don't appear, so testing on the site itself has no consequences:
https://youtu.be/M0UlMpA-7pY

Comment by Eli Schwartz (eschwartz) - Sunday, 03 February 2019, 05:02 GMT
I've redacted your offtopic irrelevant attempt at derailing this bug report, and I strongly encourage you to stop picking fights over pacman development in unrelated bug reports. Assuming you know what's good for you.

On the topic of this bug report: if the bug report is correct, there must be something different about the AUR that makes this not work in production -- but the only difference that makes sense is I'm using sqlite and the server is using mariadb. As far as I know mariadb should support unicode just fine, but digging around, the settings look a bit odd:

>>> import aurweb.db
>>> from pprint import pprint
>>> conn = aurweb.db.Connection()
>>> cur = conn.execute("SHOW VARIABLES WHERE Variable_name LIKE 'character\_set\_%' OR Variable_name LIKE 'collation%'")
>>> pprint(cur.fetchall())
[('character_set_client', 'utf8mb4'),
 ('character_set_connection', 'utf8mb4'),
 ('character_set_database', 'utf8'),
 ('character_set_filesystem', 'binary'),
 ('character_set_results', 'utf8mb4'),
 ('character_set_server', 'utf8mb4'),
 ('character_set_system', 'utf8'),
 ('collation_connection', 'utf8mb4_general_ci'),
 ('collation_database', 'utf8_general_ci'),
 ('collation_server', 'utf8mb4_general_ci')]

I will punt to lfleischer on this. IIRC mysql is weird about utf8: its "utf8" charset isn't really full UTF-8 unless you use the utf8mb4 version. So it sounds like, in order to support annoying people who use unicode emoji to communicate serious messages, we might need to change some of these from utf8 to utf8mb4? This would be a database-level problem...
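
To make it easier to see which of the settings listed above are not on utf8mb4, here is a minimal sketch that filters the pasted output. The `settings` list is copied from the session above; the helper name `non_utf8mb4` is made up for illustration and is not part of aurweb.

```python
# Values copied from the SHOW VARIABLES output in this comment.
settings = [
    ("character_set_client", "utf8mb4"),
    ("character_set_connection", "utf8mb4"),
    ("character_set_database", "utf8"),
    ("character_set_filesystem", "binary"),
    ("character_set_results", "utf8mb4"),
    ("character_set_server", "utf8mb4"),
    ("character_set_system", "utf8"),
    ("collation_connection", "utf8mb4_general_ci"),
    ("collation_database", "utf8_general_ci"),
    ("collation_server", "utf8mb4_general_ci"),
]


def non_utf8mb4(rows):
    """Return the variable names whose value is neither 'binary' nor utf8mb4-based.

    Hypothetical helper, for illustration only.
    """
    return [
        name
        for name, value in rows
        if value != "binary" and not value.startswith("utf8mb4")
    ]


for name in non_utf8mb4(settings):
    print(name)
```

As far as I know, character_set_system is a read-only variable that always reports utf8, so the database-level charset and collation would be the ones to look at first.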

I know unicode currently works for most users, at least to the extent that, say, Chinese can be correctly inserted. But those use 3-byte utf8, not 4-byte characters...
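
The 3-byte versus 4-byte distinction can be checked with plain Python, independent of aurweb. This is only a sketch: the function names `utf8_len` and `fits_in_3_byte_utf8` are made up for illustration, and the check assumes that a 3-byte "utf8" column can hold exactly the code points up to U+FFFF.

```python
def utf8_len(ch: str) -> int:
    """Number of bytes needed to encode the given text as UTF-8."""
    return len(ch.encode("utf-8"))


def fits_in_3_byte_utf8(text: str) -> bool:
    """True if every code point is at most U+FFFF, i.e. needs at most 3 UTF-8 bytes."""
    return all(ord(ch) <= 0xFFFF for ch in text)


# One sample character per class: ASCII, accented Latin, Chinese, emoji.
for label, ch in [("ascii", "a"), ("accented", "é"), ("chinese", "中"), ("emoji", "😀")]:
    print(label, utf8_len(ch), fits_in_3_byte_utf8(ch))
```

A comment made up only of Chinese text passes the check, while one containing an emoji such as 😀 (U+1F600) does not, which would match the behaviour described in this report.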
