FS#49457 - [libxml2] python3 bindings are broken

Attached to Project: Arch Linux
Opened by Abdó Roig-Maranges (abdo) - Tuesday, 24 May 2016, 09:13 GMT
Last edited by Jan de Groot (JGC) - Thursday, 26 May 2016, 11:21 GMT
Task Type Bug Report
Category Packages: Extra
Status Closed
Assigned To Jan de Groot (JGC)
Architecture All
Severity Medium
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 2
Private No

Details

Description:

python3 bindings are broken again.


Additional info:

* libxml2 2.9.3+0+gbdec218


Steps to reproduce:

$ python -c 'import libxml2'
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/usr/lib/python3.5/site-packages/libxml2.py", line 9326
XML_CHAR_ENCODING_ASCII XML_T
^
SyntaxError: invalid syntax

Closed by  Jan de Groot (JGC)
Thursday, 26 May 2016, 11:21 GMT
Reason for closing:  Fixed
Additional comments about closing:  Fixed in package using workarounds in PKGBUILD.
Comment by Giancarlo Razzolini (grazzolini) - Tuesday, 24 May 2016, 11:28 GMT
I can confirm this. The bindings are indeed not working for python3. Downgrading the package solves the issue, but I suspect an upstream issue.
Comment by Abdó Roig-Maranges (abdo) - Tuesday, 24 May 2016, 11:46 GMT
Well yes, most probably this is an upstream thing, but since this package tracks upstream git already... it should at least track non-broken commits.

I spent some time trying to understand where the hell this line comes from in the upstream build. This piece of python is auto-generated.

But in the meantime, a fix for this particular thing is a one-liner. Just replace the offending line with

XML_CHAR_ENCODING_ASCII = 22

Comment by Abdó Roig-Maranges (abdo) - Tuesday, 24 May 2016, 11:54 GMT
By the way, if anyone wants to track this down, my suspicion is that something parses include/libxml/encoding.h and gets tripped up by the lack of a trailing comma on the line for the XML_CHAR_ENCODING_ASCII enum item, since it is the last entry in the enum. The prime suspect would be python/generator.py, but that file has not been touched since 2014, so I have no idea what is going on.
Comment by Jan de Groot (JGC) - Wednesday, 25 May 2016, 09:59 GMT
The i686 bindings seem to work fine.

This is caused by MAKEFLAGS. Every time I build libxml2 the python bindings are different. In the build logs I see the python class generator is called 3 times in parallel.

Also, the testsuite doesn't work for python 3.x and errors don't seem to be fatal, so this wasn't caught when upgrading libxml2. In fact, this bug could apply to any version of libxml2.
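
A minimal sketch of a check that would have made this failure fatal at build time, assuming the generated module is available as text. The helper name `parses_as_python` is hypothetical and not part of libxml2 or its build system; it only compiles the source, it does not import or run it:

```python
def parses_as_python(source, filename="<generated>"):
    """Return True if source compiles as Python, False on a SyntaxError."""
    try:
        # compile() only parses and builds a code object; nothing is executed.
        compile(source, filename, "exec")
    except SyntaxError:
        return False
    return True


# The broken line from the traceback and the one-line fix from this report.
print(parses_as_python("XML_CHAR_ENCODING_ASCII XML_T\n"))
print(parses_as_python("XML_CHAR_ENCODING_ASCII = 22\n"))
```

A packaging script could read the installed libxml2.py and abort the build when a check like this returns False.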
Comment by Ralph Corderoy (RalphCorderoy) - Wednesday, 25 May 2016, 10:47 GMT
Viewing https://www.archlinux.org/packages/extra/i686/libxml2/ and clicking "Bug Reports", top right, visits https://bugs.archlinux.org/?project=1&cat[]=2&string=libxml2 that does not list this bug. Editing to remove the cat... gives https://bugs.archlinux.org/?project=1&string=libxml2 that does. That's why I raised a duplicate, and why others might too. Is this a flaw with Flyspray's "Bug Reports" link?
Comment by Abdó Roig-Maranges (abdo) - Wednesday, 25 May 2016, 10:55 GMT
@jgc Then the bindings are much more messed up than I was thinking! Can we just do `make -j1` in the PKGBUILD? Would that help?
Comment by Jan de Groot (JGC) - Wednesday, 25 May 2016, 12:39 GMT
Disabling MAKEFLAGS doesn't help. It removes the 3-time parallel generation of files, but so far I haven't been able to generate python bindings that are the same on two makepkg invocations. IMHO a generator should produce the same output every time.
Comment by Abdó Roig-Maranges (abdo) - Wednesday, 25 May 2016, 13:01 GMT
Ok, I have a theory. The Makefile rule for the generator has multiple targets. It appears that a single invocation of the generator produces all outputs.

Multiple-target rules are a very misleading feature of GNU make, because make really produces one identical single-target rule for each of the targets. You end up with several rules of one target each, whose commands each touch all the files. This is racy as hell on parallel builds.

If this theory is right, invoking make with -j1 will prevent the generator invocations from racing against each other and making a mess.

Then, I imagine that the non-deterministic output may have to do with python data structures (python dicts have no defined order), and would be harmless, unless you care about deterministic builds.
Comment by Jan de Groot (JGC) - Wednesday, 25 May 2016, 13:48 GMT
Found the reason the builds are not reproducible: it only applies to the python 3.x bindings. The generator script uses dicts, whose iteration order is randomized in Python 3.x.
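
A minimal sketch of how a generator could be made deterministic, assuming it collects constants in a dict before writing them out. The function `emit_constants` and the sample data are hypothetical and are not taken from python/generator.py:

```python
def emit_constants(constants):
    """Render name/value pairs as Python assignments in a stable order."""
    lines = []
    # sorted() fixes the order, so the output no longer depends on how
    # the dict happens to iterate in a given interpreter run.
    for name in sorted(constants):
        lines.append("{} = {}".format(name, constants[name]))
    return "\n".join(lines) + "\n"


# Hypothetical sample data, for illustration only.
sample = {
    "XML_CHAR_ENCODING_ASCII": 22,
    "XML_CHAR_ENCODING_UTF8": 1,
    "XML_CHAR_ENCODING_NONE": 0,
}
print(emit_constants(sample), end="")
```

Sorting at the point of output leaves the rest of the script alone, and the same input then yields byte-identical files on every run.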
Comment by Abdó Roig-Maranges (abdo) - Wednesday, 25 May 2016, 13:54 GMT
Cool!

Do you plan to send a patch upstream for the build system? If you don't, I'll prepare something later. Regarding the dicts... It does not bug me enough to fix it.
Comment by Ralph Corderoy (RalphCorderoy) - Wednesday, 25 May 2016, 14:27 GMT
python/Makefile.am does look buggy.
A fan in of $(GENERATED) depending on a new phony target,
and that then depending on the original dependencies,
will stop make running generator.py multiple times.
Comment by Ralph Corderoy (RalphCorderoy) - Thursday, 26 May 2016, 09:31 GMT
libxml2-2.9.4+0+gbdec218-2 now installed.
Syntax error fixed.
