Arch Linux

FS#37006 - [python2] horrible performance

Attached to Project: Arch Linux
Opened by Carlos (memeplex) - Saturday, 21 September 2013, 07:17 GMT
Last edited by Jan de Groot (JGC) - Friday, 06 December 2013, 08:10 GMT
Task Type: Bug Report
Category: Packages: Extra
Status: Closed
Assigned To: Angel Velasquez (angvp)
Architecture: i686
Severity: High
Priority: Normal
Reported Version:
Due in Version: Undecided
Due Date: Undecided
Percent Complete: 100%
Votes: 1
Private: No

Details

Comparison against a custom python, built from source with default options:

Custom:

Python 2.7.5 (default, Sep 21 2013, 03:54:47)
[GCC 4.8.1 20130725 (prerelease)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import time
>>> t=time.time(); import numpy; print time.time()-t
1.06899690628
>>> t=time.time(); import pandas; print time.time()-t
1.86351799965

Arch:

Python 2.7.5 (default, Sep 6 2013, 09:59:46)
[GCC 4.8.1 20130725 (prerelease)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import time
>>> t=time.time(); import numpy; print time.time()-t
1.51807308197
>>> t=time.time(); import pandas; print time.time()-t
2.81632208824

Both suck, I agree, but the first sucks less. The numbers are more or less consistent across different runs. The site-packages are the same for both versions (the ones provided by Arch in /usr/lib/python2.7/site-packages).
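
For anyone who wants to repeat the comparison, here is a minimal sketch of the same measurement. It starts a fresh interpreter for each import so an already-loaded module cannot skew the number. The helper name `time_import` and its `python` parameter are made up for illustration; pass the path of whichever interpreter you want to test (the custom build or /usr/bin/python2).

```python
import subprocess
import sys


def time_import(module, python=sys.executable):
    """Return the seconds a fresh interpreter needs to import `module`.

    `python` is the interpreter to launch (hypothetical parameter; it
    defaults to the interpreter running this script).
    """
    # Same one-liner as in the report, written so that it parses under
    # both Python 2 and Python 3.
    snippet = (
        "import time; t = time.time(); "
        "import {0}; print(time.time() - t)".format(module)
    )
    # A new process per measurement means nothing is cached in sys.modules.
    out = subprocess.check_output([python, "-c", snippet])
    return float(out.decode().strip().splitlines()[-1])
```

Calling `time_import("numpy", "/usr/bin/python2")` a few times as a normal user, and again after byte-compiling, should show whether the missing .pyc files account for the difference.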

Closed by Jan de Groot (JGC)
Friday, 06 December 2013, 08:10 GMT
Reason for closing: Fixed
Comment by Carlos (memeplex) - Saturday, 21 September 2013, 07:31 GMT
Seems to be a problem with module compilation. If you run python -v, then import numpy you get stuff like the following:

import numpy.ma # from /usr/lib/python2.7/site-packages/numpy/ma/__init__.py
# can't create /usr/lib/python2.7/site-packages/numpy/ma/__init__.pyc
# /usr/lib/python2.7/site-packages/numpy/ma/core.pyc has bad mtime
import numpy.ma.core # from /usr/lib/python2.7/site-packages/numpy/ma/core.py
# can't create /usr/lib/python2.7/site-packages/numpy/ma/core.pyc
# /usr/lib/python2.7/site-packages/numpy/ma/extras.pyc has bad mtime
import numpy.ma.extras # from /usr/lib/python2.7/site-packages/numpy/ma/extras.py
# can't create /usr/lib/python2.7/site-packages/numpy/ma/extras.pyc

You see the pattern, lots of "can't create *.pyc".

If you then run it as root, bytecode compilation succeeds and import is in much better shape:

Python 2.7.5 (default, Sep 6 2013, 09:59:46)
[GCC 4.8.1 20130725 (prerelease)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import time
>>> t=time.time(); import numpy; print time.time()-t
0.371647834778
>>>

Comment by Paul Bredbury (brebs) - Saturday, 21 September 2013, 11:05 GMT
Probably want e.g.:

cd /usr/lib/python2.7/site-packages && { python2 -m compileall . ; python2 -O -m compileall . ; }

It seems to be common to create both .pyc and .pyo files.
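
The same workaround can be scripted with the standard library's `compileall` module. This is only a sketch: the wrapper name `byte_compile_tree` is made up, it has to be run by a user with write access to the directory (root for /usr/lib), and it covers only the plain .pyc pass. For the optimized files you would run it again under `-O`.

```python
import compileall


def byte_compile_tree(path):
    """Byte-compile every .py file under `path`; return True if all succeeded."""
    # force=True recompiles even when an existing bytecode file looks current,
    # which is what we want when the packaged timestamps cannot be trusted.
    # quiet=1 limits the output to error messages.
    return bool(compileall.compile_dir(path, force=True, quiet=1))
```

Run it with the interpreter that owns the tree, for example python2 for /usr/lib/python2.7/site-packages.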
Comment by Carlos (memeplex) - Monday, 23 September 2013, 14:00 GMT
  • Field changed: Percent Complete (100% → 0%)
How is an x5 import time not a bug? There is a problem with the .pyc files in the package: bad modification dates or whatever. I'm not supposed to run compileall every time I update, am I?
Comment by Paul Bredbury (brebs) - Monday, 23 September 2013, 21:59 GMT
Comment by Carlos (memeplex) - Monday, 23 September 2013, 22:00 GMT
To reproduce:

sudo pacman -Sy python2

python2 -v -c 'import numpy' # not as root!
Comment by Carlos (memeplex) - Monday, 23 September 2013, 22:01 GMT
Yes, I'm bytecompiling after install with compileall too, as a workaround.
Comment by Angel Velasquez (angvp) - Monday, 23 September 2013, 22:01 GMT
I didn't have numpy installed (I do now, to reproduce this). I tested with other libraries and couldn't reproduce it at all. Digging deeper.
Comment by Carlos (memeplex) - Monday, 23 September 2013, 22:05 GMT
It has nothing to do with numpy, it even happens with the base packages.
Comment by PT. Ma. (BOYPT) - Thursday, 14 November 2013, 04:36 GMT
$ stat -c "%n: %Y" /usr/lib/python2.7/os.py*
/usr/lib/python2.7/os.py: 1378454192
/usr/lib/python2.7/os.pyc: 1378454179
/usr/lib/python2.7/os.pyo: 1378454185

os.py is newer than both the .pyc and the .pyo, which is what triggers python's recompile mechanism.
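
The "bad mtime" message can also be checked directly. As far as I know, a Python 2.7 .pyc starts with a 4-byte magic number followed by the source file's mtime as a 4-byte little-endian integer, and the import code compares that stored value with the current mtime of the .py file. The sketch below assumes that layout (it does not apply to Python 3.7+ bytecode files, which have a longer header); both function names are made up for illustration.

```python
import os
import struct


def stored_source_mtime(pyc_path):
    """Read the source mtime recorded in a Python 2.7 style .pyc header."""
    with open(pyc_path, "rb") as f:
        header = f.read(8)
    if len(header) < 8:
        raise ValueError("truncated .pyc header: %s" % pyc_path)
    # Assumed layout: bytes 0-3 magic number, bytes 4-7 source mtime
    # (unsigned 32-bit, little-endian).
    return struct.unpack("<I", header[4:8])[0]


def has_bad_mtime(py_path, pyc_path):
    """True when the mtime recorded in the .pyc no longer matches the source."""
    return stored_source_mtime(pyc_path) != int(os.stat(py_path).st_mtime)
```

For the files above, `has_bad_mtime("/usr/lib/python2.7/os.py", "/usr/lib/python2.7/os.pyc")` would be the call to try.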
Comment by Felix Yan (felixonmars) - Thursday, 14 November 2013, 04:44 GMT
So I think this is a bug in the numpy packaging: it runs sed -i after setup.py install, which gives those .py files a later mtime than the .pyc/.pyo files.

This also applies to the 'python2' package itself, as its PKGBUILD does the same thing:

# clean up #!s
find "${pkgdir}"/usr/lib/python${_pybasever}/ -name '*.py' | \
xargs sed -i "s|#[ ]*![ ]*/usr/bin/env python$|#!/usr/bin/env python2|"
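
To illustrate the mechanism: any in-place edit after byte-compilation bumps the mtime of the .py file unless the timestamps are put back afterwards. The sketch below does the same shebang substitution as the sed call but restores the original times with `os.utime`. It is one possible way to avoid the stale bytecode, not what the PKGBUILD does; the function name is made up, and it needs Python 3.3+ for the `ns` argument.

```python
import os
import re

# Same pattern as the sed expression above, applied line by line.
SHEBANG = re.compile(rb"^#[ ]*![ ]*/usr/bin/env python$", re.MULTILINE)


def fix_shebang_keep_mtime(path):
    """Point '#!/usr/bin/env python' at python2 without changing timestamps.

    Returns True if the file was modified.
    """
    st = os.stat(path)
    with open(path, "rb") as f:
        data = f.read()
    fixed = SHEBANG.sub(b"#!/usr/bin/env python2", data)
    if fixed == data:
        return False
    with open(path, "wb") as f:
        f.write(fixed)
    # Put back the original access and modification times so an already
    # compiled .pyc still matches the source.
    os.utime(path, ns=(st.st_atime_ns, st.st_mtime_ns))
    return True
```

Doing the substitution before the files are byte-compiled avoids the problem without any timestamp juggling.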
Comment by Felix Yan (felixonmars) - Thursday, 14 November 2013, 07:51 GMT
I made a patch for python2's PKGBUILD to sed the .py files in prepare(), with a hack to fix the build process.
Comment by Carlos (memeplex) - Friday, 06 December 2013, 05:09 GMT
I have reported this to the numpy packager. This bug must be top priority I would say. There's a lot of stuff in the system silently running python, this isn't a concern just for python programmers. I opened it more than two months ago.
