FS#61329 - hplip: .pyc files are untracked by pacman

Attached to Project: Arch Linux
Opened by Chris Billington (chrisjbillington) - Wednesday, 09 January 2019, 21:37 GMT
Last edited by Andreas Radke (AndyRTR) - Thursday, 19 March 2020, 06:54 GMT
Task Type Bug Report
Category Packages: Extra
Status Closed
Assigned To Andreas Radke (AndyRTR)
Architecture All
Severity Low
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 0
Private No

Details

Description:

lostfiles reports that .pyc files within /usr/share/hplip/ are not tracked by pacman.

This appears to be due to the hplip package shipping .py files that are not compiled. Since hplip binaries might be run as root to configure printers, the compilation to python bytecode happens at runtime, and writes these .pyc files to disk, but they are untracked by pacman since they were not in the original package.

I would think that these .pyc files ought to be tracked as part of the hplip package, and included in the built package rather than being generated at runtime. This way they will not be left on the system after hplip is uninstalled.

If I understand correctly, Python packages on Arch usually have something like:

python setup.py install --root="$pkgdir/" --optimize=1 --skip-build

in their PKGBUILD, where the option --optimize=1 is compiling the bytecode ahead of time for inclusion the package. Since the python modules/scripts shipped in hplip are not installed using distutils, this option is unavailable, but something similar to the following ought to do the trick:

python -O -m compileall $pkgdir

This will generate optimised bytecode for all .py files in $pkgdir similarly to --optimize=1 with distutils.


Additional info:
* package version(s): hplip 1:3.18.12-1
* config and/or log files etc:
Below is the output of lostfiles showing the pyc files:

/usr/share/hplip/base/pexpect/__pycache__
/usr/share/hplip/base/pexpect/__pycache__/__init__.cpython-37.pyc
/usr/share/hplip/base/__pycache__
/usr/share/hplip/base/__pycache__/avahi.cpython-37.pyc
/usr/share/hplip/base/__pycache__/codes.cpython-37.pyc
/usr/share/hplip/base/__pycache__/device.cpython-37.pyc
/usr/share/hplip/base/__pycache__/g.cpython-37.pyc
/usr/share/hplip/base/__pycache__/__init__.cpython-37.pyc
/usr/share/hplip/base/__pycache__/ldif.cpython-37.pyc
/usr/share/hplip/base/__pycache__/LedmWifi.cpython-37.pyc
/usr/share/hplip/base/__pycache__/logger.cpython-37.pyc
/usr/share/hplip/base/__pycache__/mdns.cpython-37.pyc
/usr/share/hplip/base/__pycache__/models.cpython-37.pyc
/usr/share/hplip/base/__pycache__/module.cpython-37.pyc
/usr/share/hplip/base/__pycache__/os_utils.cpython-37.pyc
/usr/share/hplip/base/__pycache__/password.cpython-37.pyc
/usr/share/hplip/base/__pycache__/pkit.cpython-37.pyc
/usr/share/hplip/base/__pycache__/pml.cpython-37.pyc
/usr/share/hplip/base/__pycache__/services.cpython-37.pyc
/usr/share/hplip/base/__pycache__/six.cpython-37.pyc
/usr/share/hplip/base/__pycache__/sixext.cpython-37.pyc
/usr/share/hplip/base/__pycache__/slp.cpython-37.pyc
/usr/share/hplip/base/__pycache__/smart_install.cpython-37.pyc
/usr/share/hplip/base/__pycache__/status.cpython-37.pyc
/usr/share/hplip/base/__pycache__/strings.cpython-37.pyc
/usr/share/hplip/base/__pycache__/tui.cpython-37.pyc
/usr/share/hplip/base/__pycache__/utils.cpython-37.pyc
/usr/share/hplip/base/__pycache__/validation.cpython-37.pyc
/usr/share/hplip/base/__pycache__/vcard.cpython-37.pyc
/usr/share/hplip/base/__pycache__/wifi.cpython-37.pyc
/usr/share/hplip/fax/__pycache__
/usr/share/hplip/fax/__pycache__/coverpages.cpython-37.pyc
/usr/share/hplip/fax/__pycache__/fax.cpython-37.pyc
/usr/share/hplip/fax/__pycache__/__init__.cpython-37.pyc
/usr/share/hplip/installer/__pycache__
/usr/share/hplip/installer/__pycache__/core_install.cpython-37.pyc
/usr/share/hplip/installer/__pycache__/dcheck.cpython-37.pyc
/usr/share/hplip/installer/__pycache__/__init__.cpython-37.pyc
/usr/share/hplip/installer/__pycache__/pluginhandler.cpython-37.pyc
/usr/share/hplip/prnt/__pycache__
/usr/share/hplip/prnt/__pycache__/cups.cpython-37.pyc
/usr/share/hplip/prnt/__pycache__/__init__.cpython-37.pyc
/usr/share/hplip/prnt/__pycache__/ldl.cpython-37.pyc
/usr/share/hplip/prnt/__pycache__/pcl.cpython-37.pyc
/usr/share/hplip/ui5/__pycache__
/usr/share/hplip/ui5/__pycache__/__init__.cpython-37.pyc
/usr/share/hplip/ui5/__pycache__/plugindialog_base.cpython-37.pyc
/usr/share/hplip/ui5/__pycache__/plugindialog.cpython-37.pyc
/usr/share/hplip/ui5/__pycache__/readonlyradiobutton.cpython-37.pyc
/usr/share/hplip/ui5/__pycache__/setupdialog_base.cpython-37.pyc
/usr/share/hplip/ui5/__pycache__/setupdialog.cpython-37.pyc
/usr/share/hplip/ui5/__pycache__/ui_utils.cpython-37.pyc
/usr/share/hplip/ui5/__pycache__/wifisetupdialog_base.cpython-37.pyc
/usr/share/hplip/ui5/__pycache__/wifisetupdialog.cpython-37.pyc


Steps to reproduce:

Install hplip: sudo pacman -S hplip
Run one of its utilitues as root: sudo hp-setup
It is enough to close the GUI program hp-setup immediately.
Then run lostfiles and see the above .pyc files have been generated but are untracked by pacman
This task depends upon

Closed by  Andreas Radke (AndyRTR)
Thursday, 19 March 2020, 06:54 GMT
Reason for closing:  Fixed
Additional comments about closing:  1:3.20.3-2
Comment by Chris Billington (chrisjbillington) - Wednesday, 09 January 2019, 21:47 GMT
The documentation of the Python compileall module [1] mentions the -d (destdir) option, which controls the prefix for the paths (used in tracebacks) that get compiled into the bytecode. This might be important for ensuring that no references to the build directory are compiled into the files, though the user should never see these unless they delete the corresponding .py files.

Furthermore, it appears most packages in arch contain both optimised and non-optimised pyc files. To get both, compileall needs to run twice, once with the -O option and once without.

Taking all that into account, the following two lines might make sense:

python -O -m compileall -d / "$pkgdir"
python -m compileall -d / "$pkgdir"

[1] https://docs.python.org/3.7/library/compileall.html
Comment by Andreas Radke (AndyRTR) - Thursday, 10 January 2019, 11:51 GMT
Can you try to build with this patch applied to check if it solves the issue:

https://src.fedoraproject.org/rpms/hplip/blob/master/f/hplip-no-write-bytecode.patch
Comment by Eli Schwartz (eschwartz) - Thursday, 10 January 2019, 20:46 GMT
Why is disabling python from writing bytecode to disk considered a better option than generating bytecode in the package() function?

Note that either way python is compiling bytecode, and the only difference is if it caches it for later as a speed improvement... I don't comprehend the Fedora patch at all.
Comment by Chris Billington (chrisjbillington) - Friday, 11 January 2019, 23:26 GMT
After testing, no, the patch doesn't solve the issue. There are two reasons I can see.

One, it only covers a couple of the scripts, there are more scripts with more shebang lines. In particular it doesn't cover hp-setup (symlink to /usr/lib/hplip/setup.py).

The second reason it doesn't work, even if I extend it to include hp-setup, looks like maybe a change in how env works since fedora wrote their patch:

$ sudo hp-setup
/usr/bin/env: ‘python -B’: No such file or directory
/usr/bin/env: use -[v]S to pass options in shebang lines

But changing the shebangs to:

#!/usr/bin/env -S python -B

does work.

I'm ambivalent about which approach to use, but the compileall approach doesn't involve patch files, which is nice, and means the programs run slightly faster due to the libraries being compiled already. a sed command could also be used instead of a patch file, I suppose.

It is a bit odd to compile *everything*, as actually Python does not in general save bytecode for scripts that are run as main. So he compileall approach will create some useless .pyc files. But this is harmless, at least.

I noticed that running the hp-setup GUI tool as root also caused lostfiles to report other .pyc files within /usr/lib/python3.7/site-packages, such as some from PyQt5. It seems there are many packages that don't come with precompiled .pyc files. I wrote a script to find them, and I count about a hundred packages with files living under /usr/lib/python{2.7,3.7}/site-packages/ that lack bytecode. My methodology isn't perfect, it's possible that it is mostly false positives such as test code that will never be imported by anything, certainly not anything running as root. If I extend the search to include all .py files not in /usr/bin, I get ~500 packages that ship without bytecode. Again, could be a lot of false positives. Is this generally considered an issue that should be fixed, i.e. should I report bugs against packages about it? If there really are 500 of them though then 500 bug reports might not be the way to go!
Comment by Eli Schwartz (eschwartz) - Sunday, 13 January 2019, 04:11 GMT
On the topic of the Fedora patch, if it uses /usr/bin/env python -B then it is totally wrong as that is not how the concept of shebangs has ever worked, like ever. Actually, it depends what OS you use, but linux considers the command line to be {'/usr/bin/env', '-S python -B'} and the very recently added in 2018 -S option to env will do some complicated fiddling with the option parsing to reparse the rest of the shebang to implement multiple options.

This patch would thus never work except on alternative operating systems which do parse the shebang with word splitting. But it doesn't matter as Fedora doesn't allow env in a shebang.

Looking at this patchfile and digging into their spec file as well... here is where they sed the shebang and fix the completely invalid env line they used: https://src.fedoraproject.org/rpms/hplip/blob/master/f/hplip.spec#_319
So the patchfile is not even indicative of what they package, there are so many levels of patching going on here...

...

On the topic of general packages which don't package bytecode.

The pyqt5 packages specifically are generated using qmake, which apparently does not do byte-compilation for the project. Usually packages built via setuptools get this for free, other build systems need to implement it on a case-by-case basis.

In general the way to go for this sort of many-packages-need-fixing-for-the-same-thing deal is by building a TODO list: https://www.archlinux.org/todo/

Any package in site-packages is almost surely wrong, packages elsewhere may, as you say, be not a problem due to directly executing the script as __main__
Comment by Chris Billington (chrisjbillington) - Sunday, 27 January 2019, 18:13 GMT
Thanks Eli, I will look into making a TODO list after investigating it more, and for now will report it against pyqt5 separately, perhaps waiting to see how it pans out with hplip and pyqt5 before proposing mass changes.

I've tested for hplip, and the following change to the PKGBUILD appears to be a complete solution. I would say this is a good solution to the problem, unless anyone things I'm overlooking something.

$ git diff
diff --git a/trunk/PKGBUILD b/trunk/PKGBUILD
index 3f97dfc..dc8d896 100644
--- a/trunk/PKGBUILD
+++ b/trunk/PKGBUILD
@@ -73,6 +73,10 @@ package() {
cd $pkgname-$pkgver
make -j1 rulesdir=/usr/lib/udev/rules.d DESTDIR="$pkgdir/" install

+ # compile Python bytecode
+ python -O -m compileall -d / "$pkgdir"
+ python -m compileall -d / "$pkgdir"
+
# remove config provided by sane and autostart of hp-daemon
rm -rf "$pkgdir"/etc/{sane.d,xdg}
install -dm755 ${pkgdir}/etc/sane.d/dll.d
Comment by Chris Billington (chrisjbillington) - Sunday, 27 January 2019, 18:24 GMT
Possibly the following is more future-proof since it excludes /usr/bin, which the package does not put any python scripts in at present (just links), but which it may in the future, and we would not want to byte-compile them. I have also tested this.

# compile Python bytecode
python -O -m compileall -d / "$pkgdir"{/usr/lib,/usr/share}
python -m compileall -d / "$pkgdir"{/usr/lib,/usr/share}
Comment by Andreas Radke (AndyRTR) - Wednesday, 30 January 2019, 06:47 GMT
I guess that affects all supported distributions and even custom installations. So I think this should be better brought to the upstream launchpad tracker.
Comment by Andreas Radke (AndyRTR) - Friday, 06 September 2019, 20:15 GMT
Still waiting for someone reporting this upstream.
Comment by Chris Billington (chrisjbillington) - Friday, 06 September 2019, 20:44 GMT
Thanks for the reminder, I've filed a bug upstream here:

https://bugs.launchpad.net/hplip/+bug/1843100

I'm not totally convinced the fix belongs upstream though, since Arch and other distros will still need to a) patch the makefile or whatnot to make it use the correct Python interpreter when building bytecode, and b) bump the pkgrel and rebuild when the distro starts shipping a new minor Python version. So this makes it 'smell' like a distro-level problem to me. Comparing to other packages, whilst distutils installs regular bytecode, it doesn't install optimised bytecode, so the Arch wiki's python package guidelines suggest using the --optimize flag to install with distutils, and to manually invoke compileall for pip-installed packages which don't compile bytecode at all. So the problem so far seems to be somewhat treated as a distro-level one.

Slightly off topic, but the issue applies to so many packages that I wonder if it would be appropriate for including in namcap a check for Python files outside /usr/bin that lack bytecode. I may look into making a patch for this for namcap.
Comment by Chris Billington (chrisjbillington) - Wednesday, 04 March 2020, 14:32 GMT
Upstream bug hasn't gotten any traction so far.

Another example, the recent firewalld manual intervention was due to resolving this issue:

https://git.archlinux.org/svntogit/community.git/commit/trunk?h=packages/firewalld&id=90ee8c75050382bb6d4d4bfc77eb6c48eeabc5dd


The fix there is better than what I proposed because it sets the correct path for the compiled bytecode (-d /usr/lib)

This package could do the same except for /usr/share:

@@ -88,4 +88,9 @@ package() {

# add mixed license file
install -Dt "${pkgdir}"/usr/share/licenses/${pkgname} -m644 COPYING
+
+ # Compile Python bytecode:
+ python -m compileall -d /usr/share "$pkgdir/usr/share"
+ python -O -m compileall -d /usr/share "$pkgdir/usr/share"
+
}
Comment by Andreas Radke (AndyRTR) - Wednesday, 18 March 2020, 12:29 GMT
I've uploaded 3.20.3-2 with the compiled files in /usr/share though I think they belong better to /usr/lib/python3.8. But there's no simple configure option to place them properly.
If we keep it this way now please share some file conflicts that you will probably hit to write a news draft.
Comment by Chris Billington (chrisjbillington) - Wednesday, 18 March 2020, 15:04 GMT
Thanks! Here's the file conflicts message I get:

$ sudo pacman -U hplip-1_3.20.3-2-x86_64.pkg.tar.zst
loading packages...
resolving dependencies...
looking for conflicting packages...

Packages (1) hplip-1:3.20.3-2

Total Installed Size: 30.58 MiB
Net Upgrade Size: 3.41 MiB

:: Proceed with installation? [Y/n]
(1/1) checking keys in keyring [######################] 100%
(1/1) checking package integrity [######################] 100%
(1/1) loading package files [######################] 100%
(1/1) checking for file conflicts [######################] 100%
error: failed to commit transaction (conflicting files)
hplip: /usr/share/hplip/base/__pycache__/__init__.cpython-38.pyc exists in filesystem
hplip: /usr/share/hplip/base/__pycache__/avahi.cpython-38.pyc exists in filesystem
hplip: /usr/share/hplip/base/__pycache__/codes.cpython-38.pyc exists in filesystem
hplip: /usr/share/hplip/base/__pycache__/device.cpython-38.pyc exists in filesystem
hplip: /usr/share/hplip/base/__pycache__/g.cpython-38.pyc exists in filesystem
hplip: /usr/share/hplip/base/__pycache__/logger.cpython-38.pyc exists in filesystem
hplip: /usr/share/hplip/base/__pycache__/mdns.cpython-38.pyc exists in filesystem
hplip: /usr/share/hplip/base/__pycache__/models.cpython-38.pyc exists in filesystem
hplip: /usr/share/hplip/base/__pycache__/module.cpython-38.pyc exists in filesystem
hplip: /usr/share/hplip/base/__pycache__/os_utils.cpython-38.pyc exists in filesystem
hplip: /usr/share/hplip/base/__pycache__/password.cpython-38.pyc exists in filesystem
hplip: /usr/share/hplip/base/__pycache__/pml.cpython-38.pyc exists in filesystem
hplip: /usr/share/hplip/base/__pycache__/services.cpython-38.pyc exists in filesystem
hplip: /usr/share/hplip/base/__pycache__/six.cpython-38.pyc exists in filesystem
hplip: /usr/share/hplip/base/__pycache__/sixext.cpython-38.pyc exists in filesystem
hplip: /usr/share/hplip/base/__pycache__/slp.cpython-38.pyc exists in filesystem
hplip: /usr/share/hplip/base/__pycache__/status.cpython-38.pyc exists in filesystem
hplip: /usr/share/hplip/base/__pycache__/strings.cpython-38.pyc exists in filesystem
hplip: /usr/share/hplip/base/__pycache__/tui.cpython-38.pyc exists in filesystem
hplip: /usr/share/hplip/base/__pycache__/utils.cpython-38.pyc exists in filesystem
hplip: /usr/share/hplip/base/__pycache__/validation.cpython-38.pyc exists in filesystem
hplip: /usr/share/hplip/base/pexpect/__pycache__/__init__.cpython-38.pyc exists in filesystem
hplip: /usr/share/hplip/installer/__pycache__/__init__.cpython-38.pyc exists in filesystem
hplip: /usr/share/hplip/installer/__pycache__/core_install.cpython-38.pyc exists in filesystem
hplip: /usr/share/hplip/installer/__pycache__/dcheck.cpython-38.pyc exists in filesystem
hplip: /usr/share/hplip/installer/__pycache__/pluginhandler.cpython-38.pyc exists in filesystem
hplip: /usr/share/hplip/prnt/__pycache__/__init__.cpython-38.pyc exists in filesystem
hplip: /usr/share/hplip/prnt/__pycache__/cups.cpython-38.pyc exists in filesystem
hplip: /usr/share/hplip/prnt/__pycache__/ldl.cpython-38.pyc exists in filesystem
hplip: /usr/share/hplip/prnt/__pycache__/pcl.cpython-38.pyc exists in filesystem
Errors occurred, no packages were upgraded.
Comment by Andreas Radke (AndyRTR) - Wednesday, 18 March 2020, 15:15 GMT
Thx, prepared a news draft to adp list.

Loading...