FS#9063 - xine performance Issue 64 vs 32 bit build
Attached to Project:
Arch Linux
Opened by Sander Jansen (GogglesGuy) - Monday, 31 December 2007, 19:35 GMT
Last edited by Aaron Griffin (phrakture) - Wednesday, 23 April 2008, 16:23 GMT
Details
Running the latest kernel, I noticed a severe performance
difference between the 64-bit and 32-bit builds. When trying to
stop a playing stream in xine, it seems to wait for a mutex to
be released. This release of the mutex is significantly
slower on my two 64-bit machines than on my 32-bit
machine. The two 64-bit machines are an Intel Pentium 4 and an
Intel Core 2 Duo, while the 32-bit machine is an Athlon XP
1600 (5 years old!!).
I've attached a log with some timings (CPU ticks and time); 32-bit seems to be more than 10x faster. Someone on the xine mailing list mentioned the following, which may or may not have anything to do with this: "I noticed that on a previous release of xine, but it was not a xine issue but a kernel issue (on my hardware at least). The problem was related to a mutex which sometimes took a long time to release or lock (I don't remember exactly; the problem was not systematic). I tried several kernels (from 2.6.16 to 2.6.20) with several configurations (low latency, BKL or not, 100 Hz, 250 Hz, 1000 Hz), and the only one which didn't suffer this problem was 2.6.17.14 with this:

CONFIG_PREEMPT=y
# CONFIG_PREEMPT_BKL is not set
CONFIG_HZ_250=y
CONFIG_HZ=250"
This task depends upon
Closed by Aaron Griffin (phrakture)
Wednesday, 23 April 2008, 16:23 GMT
Reason for closing: Fixed
Additional comments about closing: Fixed in xine 1.1.12
http://dev.archlinux.org/~tpowa/2.6.24/
I've tried rc-6; the timings didn't seem to improve
(mind you, I used my other 64-bit machine this time to check the timings):
ticks: 1533992227 time: 0.350000
ticks: 1430852572 time: 0.490000
ticks: 1387090612 time: 0.420000
ticks: 1396862580 time: 0.420000
ticks: 1253362432 time: 0.400000
ticks: 1580758372 time: 0.390000
ticks: 1186466767 time: 0.400000
ticks: 1657419000 time: 0.550000
ticks: 2170108147 time: 0.680000
ticks: 2563092405 time: 0.810000
ticks: 1072197525 time: 0.360000
ticks: 2003402902 time: 0.650000
ticks: 2149056405 time: 0.670000
ticks: 672847357 time: 0.220000
ticks: 2741860237 time: 0.800000
The new timings are on average 0.02s (32-bit) vs 0.70s (64-bit). The 32-bit timings:
ticks: 13366938 time: 0.010000
ticks: 36534152 time: 0.000000
ticks: 34689161 time: 0.000000
ticks: 30580979 time: 0.010000
ticks: 38078827 time: 0.010000
ticks: 36933739 time: 0.010000
The 64-bit timings seem worse than before:
ticks: 775073295 time: 0.530000
ticks: 1286406621 time: 0.920000
ticks: 458499132 time: 0.320000
ticks: 1180204587 time: 0.830000
ticks: 766595115 time: 0.560000
ticks: 905110875 time: 0.650000
ticks: 859795344 time: 0.630000
ticks: 1374765759 time: 0.980000
ticks: 1314767367 time: 0.900000
http://bugs.xine-project.org/show_bug.cgi?id=33
Basically, the demuxer thread in xine unlocks a mutex for a short time to allow other threads to lock it. I'm thinking that on multi-core CPUs the sched_yield waiting time is too short to allow the other thread to acquire the mutex.
So in short, I think the kernel is OK (unless sched_yield is broken, of course :P); it's not a 64-bit vs 32-bit issue, but rather a single-core vs multi-core issue. The fix I proposed on the xine mailing list (and which seems to work for me) is not to use sched_yield, but a short sleep instead.