Arch Linux

FS#9063 - xine performance Issue 64 vs 32 bit build

Attached to Project: Arch Linux
Opened by Sander Jansen (GogglesGuy) - Monday, 31 December 2007, 19:35 GMT
Last edited by Aaron Griffin (phrakture) - Wednesday, 23 April 2008, 16:23 GMT
Task Type Bug Report
Category Kernel
Status Closed
Assigned To Aaron Griffin (phrakture)
Architecture All
Severity Critical
Priority Normal
Reported Version 2007.08-2
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 1
Private No


Running the latest kernel, I noticed a severe performance difference between 64 bit and 32 bit. When using xine, trying to stop a playing stream seems to wait for a mutex to be released. Releasing this mutex is significantly slower on my two 64 bit machines than on my 32 bit machine. The two 64 bit machines are an Intel Pentium 4 and an Intel Core 2 Duo, while the 32 bit machine is an Athlon XP 1600 (5 years old!!).

I've attached a log with some timings (cpu ticks and time). 32 bit seems to be more than 10x faster. A guy on the xine mailing list mentioned the following, which may or may not have anything to do with this:

"I noticed that on a previous release of Xine, but It was not a Xine
issue but a kernel issue (on my hardware at least). The problem was
related to a mutex which took time to release or lock (I didn't remember
exactly) sometimes (the problem was not systematic). I tried several
kernel (from 2.6.16 to 2.6.20) with several configuration (low latency,
BKL or not, 100HZ, 250HZ, 1000HZ), and the only one which didn't suffer
this problem was with this :
   timings (2.5 KiB)

Closed by  Aaron Griffin (phrakture)
Wednesday, 23 April 2008, 16:23 GMT
Reason for closing:  Fixed
Additional comments about closing:  Fixed in xine 1.1.12
Comment by Sander Jansen (GogglesGuy) - Monday, 31 December 2007, 19:37 GMT
Note that timings measure how long it takes to return from the call 'xine_close(stream)'
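
A minimal sketch of how such a timing could be taken, assuming a POSIX system. It is not the code that produced the attached log: `time_call` and `fake_close` are hypothetical names, and `fake_close` only stands in for `xine_close(stream)` so the example builds without xine-lib. It records wall-clock seconds only, not cpu ticks.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <time.h>

/* Returns the wall-clock seconds spent inside fn(arg). */
static double time_call(void (*fn)(void *), void *arg)
{
    struct timespec start, end;

    clock_gettime(CLOCK_MONOTONIC, &start);
    fn(arg);                               /* the call being measured */
    clock_gettime(CLOCK_MONOTONIC, &end);

    return (double)(end.tv_sec - start.tv_sec)
         + (double)(end.tv_nsec - start.tv_nsec) / 1e9;
}

/* Hypothetical stand-in for xine_close(stream): just sleeps about 20 ms. */
static void fake_close(void *arg)
{
    const struct timespec ts = { 0, 20 * 1000 * 1000 };

    (void)arg;
    nanosleep(&ts, NULL);
}
```

Calling `time_call(fake_close, NULL)` returns the elapsed seconds as a double. In a real test the wrapped function would call `xine_close` on the stream instead.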
Comment by Tobias Powalowski (tpowa) - Thursday, 03 January 2008, 18:24 GMT
try one of the rc kernels instead:
Comment by Tobias Powalowski (tpowa) - Friday, 11 January 2008, 10:05 GMT
is the rc kernel working or not?
Comment by Sander Jansen (GogglesGuy) - Friday, 11 January 2008, 14:15 GMT
Sorry, I haven't had time to compile a custom kernel yet. I'll try to do it this weekend.
Comment by Dale Blount (dale) - Monday, 14 January 2008, 20:10 GMT
You shouldn't have to compile anything, tpowa linked to binary packages.
Comment by Sander Jansen (GogglesGuy) - Monday, 14 January 2008, 20:39 GMT
Ok, you're right, how stupid of me.
I've tried rc-6, the timings didn't seem to improve:
(mind you, I used my other 64 bit machine this time to check the timings):

ticks: 1533992227 time: 0.350000
ticks: 1430852572 time: 0.490000
ticks: 1387090612 time: 0.420000
ticks: 1396862580 time: 0.420000
ticks: 1253362432 time: 0.400000
ticks: 1580758372 time: 0.390000
ticks: 1186466767 time: 0.400000
ticks: 1657419000 time: 0.550000
ticks: 2170108147 time: 0.680000
ticks: 2563092405 time: 0.810000
ticks: 1072197525 time: 0.360000
ticks: 2003402902 time: 0.650000
ticks: 2149056405 time: 0.670000
ticks: 672847357 time: 0.220000
ticks: 2741860237 time: 0.800000
Comment by Tobias Powalowski (tpowa) - Saturday, 26 January 2008, 16:06 GMT
status on .24 kernel in testing?
Comment by Sander Jansen (GogglesGuy) - Sunday, 27 January 2008, 01:37 GMT
I've upgraded both my 32 bit desktop and 64 bit laptop to kernel 2.6.24-2. There is still that major performance difference.
The new timings are on average 0.02s (32bit) vs 0.70s (64bit)
Comment by Sander Jansen (GogglesGuy) - Sunday, 27 January 2008, 01:42 GMT
On closer look the timings for 32 bit actually improved over 2.6.23:
ticks: 13366938 time: 0.010000
ticks: 36534152 time: 0.000000
ticks: 34689161 time: 0.000000
ticks: 30580979 time: 0.010000
ticks: 38078827 time: 0.010000
ticks: 36933739 time: 0.010000

The timings for 64 bit seem worse than before:
ticks: 775073295 time: 0.530000
ticks: 1286406621 time: 0.920000
ticks: 458499132 time: 0.320000
ticks: 1180204587 time: 0.830000
ticks: 766595115 time: 0.560000
ticks: 905110875 time: 0.650000
ticks: 859795344 time: 0.630000
ticks: 1374765759 time: 0.980000
ticks: 1314767367 time: 0.900000
Comment by Tobias Powalowski (tpowa) - Sunday, 27 January 2008, 20:25 GMT
Please report this issue on the xine bugtracker. I talked with one of the devs there, and he is interested in debugging it there.
Comment by Sander Jansen (GogglesGuy) - Sunday, 27 January 2008, 21:40 GMT
I've submitted a bug report on the xine-bugtracker:
Comment by Sander Jansen (GogglesGuy) - Friday, 22 February 2008, 22:42 GMT
Ok, I haven't heard anything from the xine developers (they seem to be asleep, or never got my email on their mailing list). I think this issue is related to the use of sched_yield in the xine demuxer code.

Basically the demuxer thread in xine unlocks a mutex for a short time to allow other threads to lock that mutex. I'm thinking that on multi-core CPUs the sched_yield waiting time is too short to allow the other thread to lock the mutex.

So in short, I think the kernel is ok (unless sched_yield is broken of course :P). It's not a 64 vs 32 bit issue, but more a single vs multi-core issue. The fix I proposed on the xine mailing list (and which seems to work for me) is to use a sleep instead of sched_yield.
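
The pattern described above can be sketched roughly as follows. This is an illustration only, not xine's demuxer code: `struct demux_sim`, `demux_loop`, `time_to_lock` and the 1 ms pause are all assumptions made for the example. The loop holds a mutex, releases it briefly, and then either calls `sched_yield()` or sleeps before taking it again. A second thread measures how long it waits to get the lock.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <sched.h>
#include <stddef.h>
#include <time.h>

struct demux_sim {
    pthread_mutex_t lock;
    int stop;        /* only read or written while holding lock */
    int use_sleep;   /* 0: sched_yield(), 1: nanosleep() */
};

/* Simulated demuxer loop: holds the lock except for a short gap per pass. */
static void *demux_loop(void *p)
{
    struct demux_sim *d = p;
    const struct timespec pause = { 0, 1000 * 1000 }; /* 1 ms, arbitrary */

    pthread_mutex_lock(&d->lock);
    while (!d->stop) {
        /* ... demux work would happen here, with the lock held ... */
        pthread_mutex_unlock(&d->lock);
        if (d->use_sleep)
            nanosleep(&pause, NULL);  /* leaves a real window for waiters */
        else
            sched_yield();            /* may return almost immediately */
        pthread_mutex_lock(&d->lock);
    }
    pthread_mutex_unlock(&d->lock);
    return NULL;
}

/* Starts the loop and returns the seconds this thread waited for the lock. */
static double time_to_lock(int use_sleep)
{
    struct demux_sim d;
    pthread_t tid;
    struct timespec t0, t1;

    pthread_mutex_init(&d.lock, NULL);
    d.stop = 0;
    d.use_sleep = use_sleep;
    pthread_create(&tid, NULL, demux_loop, &d);

    clock_gettime(CLOCK_MONOTONIC, &t0);
    pthread_mutex_lock(&d.lock);      /* the wait being measured */
    clock_gettime(CLOCK_MONOTONIC, &t1);
    d.stop = 1;                       /* ask the loop to finish */
    pthread_mutex_unlock(&d.lock);

    pthread_join(tid, NULL);
    pthread_mutex_destroy(&d.lock);
    return (double)(t1.tv_sec - t0.tv_sec)
         + (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
}
```

Comparing `time_to_lock(0)` with `time_to_lock(1)` over many runs would show whether the yield variant starves the waiting thread on a given machine. The result depends on the scheduler and the number of cores, so no particular numbers are implied here. On older glibc versions the build may need `-pthread`.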
Comment by Sander Jansen (GogglesGuy) - Tuesday, 25 March 2008, 15:36 GMT
xine-lib 1.1.11 should have this issue fixed.
Comment by Sander Jansen (GogglesGuy) - Wednesday, 23 April 2008, 16:19 GMT
xine-lib 1.1.12 in extra fixes this issue. You may close this bug report. Thanks!