FS#11427 - k3b crashes Xorg with intel G33/i915 and xrandr dual head

Attached to Project: Arch Linux
Opened by android (android) - Monday, 08 September 2008, 22:25 GMT
Last edited by Jan de Groot (JGC) - Sunday, 07 June 2009, 09:42 GMT
Task Type Support Request
Category Packages: Extra
Status Closed
Assigned To Jan de Groot (JGC)
Architecture x86_64
Severity Medium
Priority Normal
Reported Version None
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 0
Private No

Details

Description:
Xorg crashes sporadically when running k3b to rip an audio CD. Oddly enough, ripping some audio CDs seems to cause xorg crashes more often than others.

Without running k3b, X server has been up 2 days without crash.

Previously I couldn't even start the server due to an installed library error; see this forum post:
http://bbs.archlinux.org/viewtopic.php?id=54698

The k3b crash occurred with xorg-server from extra and so I moved to the new xorg-server, xf86-video-intel and intel-dri packages in the testing repo.

Crashes continued, so I built k3b from ABS, but this did not correct the problem.

If this isn't an xorg server or intel driver bug, it may be a rebuild needed in the Qt or other underlying libraries (wishful thinking?).

Additional info:
* package version(s)
xorg-server 1.4.99.906-3
intel-dri 7.1-2
xf86-video-intel 2.4.2-1

* config and/or log files etc.
xorg.conf is very minimal to support randr

Steps to reproduce:
rip Jeff Beck "Wired" on my new x86_64 workstation 8-)

Closed by  Jan de Groot (JGC)
Sunday, 07 June 2009, 09:42 GMT
Reason for closing:  Works for me
Additional comments about closing:  Must be a hardware bug.
Comment by android (android) - Friday, 12 September 2008, 03:16 GMT

OK, just a quick update:

I installed the recent xorg-server 1.5.0-1 from testing.

This didn't correct this k3b crash issue.

I also have run without the xrandr stuff on a single monitor. This also does not correct the problem.

So it would seem this bug affects anyone using the intel driver.

Could this be? Wouldn't there be A LOT more complaining on the forums?

I've been monitoring the mailing lists over at xorg.freedesktop.org and it seems there are a number of issues with the intel driver at this point. So much for great stability due to publication of specs.

I attached my current xorg.conf for reference (it's a little comment crazed as I've been tweaking it)

Anyway, it's still AFU over here...

johnea
Comment by android (android) - Friday, 12 September 2008, 03:23 GMT

Oops, I also meant to mention that xdpyinfo shows both the XINERAMA and XRANDR extensions. Should both of these be enabled? I tried Option "Xinerama" "false" but it didn't seem to matter. -jea

Here's the whole xdpyinfo output:

[johnea@4x ~]$ xdpyinfo
name of display: :0.0
version number: 11.0
vendor string: The X.Org Foundation
vendor release number: 10500000
X.Org version: 1.5.0
maximum request size: 16777212 bytes
motion buffer size: 256
bitmap unit, bit order, padding: 32, LSBFirst, 32
image byte order: LSBFirst
number of supported pixmap formats: 7
supported pixmap formats:
depth 1, bits_per_pixel 1, scanline_pad 32
depth 4, bits_per_pixel 8, scanline_pad 32
depth 8, bits_per_pixel 8, scanline_pad 32
depth 15, bits_per_pixel 16, scanline_pad 32
depth 16, bits_per_pixel 16, scanline_pad 32
depth 24, bits_per_pixel 32, scanline_pad 32
depth 32, bits_per_pixel 32, scanline_pad 32
keycode range: minimum 8, maximum 255
focus: window 0x2400001, revert to PointerRoot
number of extensions: 29
BIG-REQUESTS
Composite
DAMAGE
DOUBLE-BUFFER
DPMS
Extended-Visual-Information
GLX
MIT-SCREEN-SAVER
MIT-SHM
MIT-SUNDRY-NONSTANDARD
RANDR
RENDER
SECURITY
SGI-GLX
SHAPE
SYNC
TOG-CUP
X-Resource
XC-APPGROUP
XC-MISC
XFIXES
XFree86-DGA
XFree86-Misc
XFree86-VidModeExtension
XINERAMA
XInputExtension
XKEYBOARD
XTEST
XVideo
default screen number: 0
number of screens: 1

screen #0:
print screen: no
dimensions: 2560x1024 pixels (752x301 millimeters)
resolution: 86x86 dots per inch
depths (7): 24, 1, 4, 8, 15, 16, 32
root window id: 0x8a
depth of root window: 24 planes
number of colormaps: minimum 1, maximum 1
default colormap: 0x20
default number of colormap cells: 256
preallocated pixels: black 0, white 16777215
options: backing-store NO, save-unders NO
largest cursor: 64x64
current input event mask: 0xda403f
KeyPressMask KeyReleaseMask ButtonPressMask
ButtonReleaseMask EnterWindowMask LeaveWindowMask
KeymapStateMask StructureNotifyMask SubstructureNotifyMask
SubstructureRedirectMask PropertyChangeMask ColormapChangeMask
number of visuals: 3
default visual id: 0x21
visual:
visual id: 0x21
class: TrueColor
depth: 24 planes
available colormap entries: 256 per subfield
red, green, blue masks: 0xff0000, 0xff00, 0xff
significant bits in color specification: 8 bits
visual:
visual id: 0x22
class: DirectColor
depth: 24 planes
available colormap entries: 256 per subfield
red, green, blue masks: 0xff0000, 0xff00, 0xff
significant bits in color specification: 8 bits
visual:
visual id: 0x68
class: TrueColor
depth: 32 planes
available colormap entries: 256 per subfield
red, green, blue masks: 0xff0000, 0xff00, 0xff
significant bits in color specification: 8 bits
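
A minimal sketch for checking which extensions the server advertises, so the XINERAMA/RANDR question above can be answered from saved xdpyinfo output instead of by eye. The function name and the shortened sample text are illustrative assumptions; the parsing relies only on the "number of extensions:" header shown in the dump above.

```python
def list_extensions(xdpyinfo_text):
    """Return the extension names listed in xdpyinfo output."""
    extensions = []
    in_block = False
    for raw in xdpyinfo_text.splitlines():
        line = raw.strip()
        if line.startswith("number of extensions:"):
            in_block = True
            continue
        if in_block:
            # The list ends at the next "key: value" line or a blank line.
            if not line or ":" in line:
                break
            extensions.append(line)
    return extensions


# Shortened, illustrative sample (not the full output from this report).
sample_xdpyinfo = """number of extensions:    3
    RANDR
    RENDER
    XINERAMA
default screen number:    0
"""

extensions = list_extensions(sample_xdpyinfo)
print("RANDR" in extensions, "XINERAMA" in extensions)
```

To use it on a live server, the text could come from running xdpyinfo and saving its output to a file first.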
Comment by Pierre Schmitz (Pierre) - Friday, 12 September 2008, 05:31 GMT
I don't think this is related to KDE.
Comment by android (android) - Friday, 12 September 2008, 18:24 GMT
Yet more info:

I've also seen a couple of xorg crashes when clicking "Remind Me" on the alarm dialog of jpilot.

Unlike the k3b crashes these do not lock the entire system. I'm dropped back into my login shell and the Xorg.0.log contains the error: (EE) intel(0): First SDVOB output reported failure to sync
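
A small sketch for pulling error lines out of an X log after a crash like this one. It assumes the usual convention that error lines begin with the (EE) marker; the function name is made up for illustration, and the sample text contains only the error quoted above plus a line from the marker legend that X prints at the top of its log.

```python
import re

# Lines that begin with the (EE) marker, optionally preceded by a bracketed
# timestamp (an assumption to cover servers that prefix one).
_EE_LINE = re.compile(r"^\s*(?:\[\s*[\d.]+\]\s*)?\(EE\)")


def xorg_error_lines(log_text):
    """Return the stripped lines of log_text that start with (EE)."""
    return [line.strip() for line in log_text.splitlines() if _EE_LINE.match(line)]


# Illustrative sample: one legend line (must be ignored) and one real error.
sample_log = """\t(WW) warning, (EE) error, (NI) not implemented, (??) unknown.
(EE) intel(0): First SDVOB output reported failure to sync
"""

for line in xorg_error_lines(sample_log):
    print(line)
```

After X has been restarted, the log from the crashed session is normally kept next to the current one with an .old suffix, so that is probably the file to feed in; the exact path is an assumption about the local setup.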

I can restart X without rebooting and then even subsequent clicks on the Remind Me button don't lead to further X crashes.

This differs from the k3b induced crashes in that k3b completely freezes the system. The graphical display does not clear to the invoking shell, the mouse locks, CTRL-ALT-BACKSPACE and CTRL-ALT-DEL have no effect and the audio track I'm playing on xmms starts to loop on the last 2 or 3 seconds of audio. I have to power off the system. After a reboot the Xorg.0.log from the crash contains no error message.

I also wanted to add a hardware note: I'm using an ADD2 PCI-E x16 card for the second (DVI) monitor and the built in VGA connector for the first monitor.

Finally, while Pierre is almost certainly correct that this issue doesn't originate in KDE, it is most certainly related to KDE in that k3b CONSISTENTLY crashes the entire computer when certain audio CDs are inserted.

In experimenting I've started k3b from the CLI instead of my fluxbox menu and watched its output prior to the crash. While I'm unable to capture the output (since the computer crashes), I believe I saw something to the effect of "invalid pregap", or something indicating it was having some trouble with the above mentioned audio CD.

I'm about to try this again to get a better idea of what the message is, but I wanted to post this first before the system goes down.

So, Pierre, you're not off the hook! 8-)

More info in a bit...

johnea
Comment by android (android) - Friday, 12 September 2008, 19:22 GMT

OK, well after about 6 or 8 crashes and reboots I managed to collect a little more info.

I've attached two files:
One is from "The Beat Farmers" "Loud, Plowed and Live" this CD rips successfully. I'm able to collect the info for this CD from the console and so it's accurately recorded.
Two is from "Jeff Beck" "Wired" which is one of the ones that consistently crashes the system. This file I've hand edited from the bits left on the screen after the crash. I had to (GASP!) write these down with a pen and paper (it felt sort of PRIMAL, but I did it for the team 8-) So I think the stuff in this one is fairly accurate, but it's not a text capture.

There are a number of things that caught my (untrained) attention:
1) DCOPClient::attachInternal. Attach failed Could not open network socket
- the network is working fine and everybody else is opening sockets
2) kdeui (KAction): WARNING: KActionCollection::operator+=(): function is severely deprecated
- but this occurs in both the successful and FAILED rips
3) (K3bDevice::HalConnection) initializing HAL >= 0.5
- I was initially getting HAL errors (doesn't EVERYONE KNOW that you have to have HAL in your rc.conf daemons!)
4) (K3bDevice::Device) /dev/sr0: READ TOC/PMA/ATIP length det failed
- I thought this was really at the root of it until I saw it was also in the successful run, oh well...

So that's it for the current data gathering session...

johnea
Comment by Jan de Groot (JGC) - Monday, 13 October 2008, 10:47 GMT
Is this still an issue with xorg-server 1.5.2 and xf86-video-intel 2.4.2?
Comment by android (android) - Monday, 13 October 2008, 18:05 GMT
I backed out of the [testing] repo after trying xorg-server 1.5.0 with no success.

I'll 'pacman -Syu' the [core] and [extra] components now and see how that goes.

Then I'll put [testing] back in the repo list to try the 1.5.2 xorg-server and see what I get from that.

I'll post results as soon as I have them.

johnea

P.S. On a further note: I'm now running the xrandr configuration with two monitors rotated 90 degrees. I find that playing video, in any format and with any player, under this xrandr configuration leads to an immediate xorg server crash. This typically does not return to the text console from which the X session was started. The mouse, CTRL-ALT-BACKSPACE and CTRL-ALT-DELETE do not work. I have a serial terminal on this machine as well and I can still operate that command line after the crash, but I'm unable to recover the VGA console from which startx was run and therefore have to reboot. At least with the serial terminal I can run shutdown instead of just powering off the machine.
Comment by android (android) - Monday, 13 October 2008, 19:26 GMT
Well, I just completed the -Syu with the existing [core] and [extra]. This didn't change the situation. The upgrade included a new cdparanoia, so I had hoped it might make a difference, but no.

I'm about to re-synch with the [testing] repo in the /etc/pacman.conf.

This, in and of itself, is somewhat problematic. After upgrading to the latest in [core] and [extra] a 'pacman -Su' with [testing] shows 110 packages that will be changed. If this leaves the system in a troubled condition, then I have to try to back-out all 110 of those package upgrades. This is no fun 8-(

More info soon...

johnea
Comment by android (android) - Monday, 13 October 2008, 19:39 GMT
OK, this is what I'll try first, upgrade only these three packages from [testing]:
xorg-server 1.5.2-1
intel-dri 7.2-1
mesa 7.2-1

I've also attached the listing of all 110 packages that are currently different from [core] and [extra]. This is in case some other package in this list could be affecting this situation.

This limited upgrade to [testing] will make downgrading easier once it's demonstrated that these upgrades don't make any difference.
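
A rough sketch of how a partial upgrade like this could be tracked, so that only the packages that actually changed need to be backed out. It assumes two saved snapshots in the "name version" format that `pacman -Q` prints; the function names are hypothetical, and the sample snapshots merely reuse version strings quoted in this report for illustration, not a real before/after state of this machine.

```python
def parse_pacman_q(text):
    """Map package name -> version from `pacman -Q` style lines."""
    packages = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:  # skip blank or malformed lines
            packages[parts[0]] = parts[1]
    return packages


def changed_packages(before, after):
    """Return {name: (old_version, new_version)} for packages that differ.

    A version of None means the package is absent from that snapshot.
    """
    changes = {}
    for name in sorted(set(before) | set(after)):
        old, new = before.get(name), after.get(name)
        if old != new:
            changes[name] = (old, new)
    return changes


# Illustrative snapshots only.
before_text = """xorg-server 1.4.99.906-3
intel-dri 7.1-2
xf86-video-intel 2.4.2-1
"""
after_text = """xorg-server 1.5.2-1
intel-dri 7.2-1
mesa 7.2-1
xf86-video-intel 2.4.2-1
"""

changes = changed_packages(parse_pacman_q(before_text), parse_pacman_q(after_text))
for name, (old, new) in changes.items():
    print(name, old, "->", new)
```

Taking one snapshot before touching [testing] and one after would give the exact list of packages to downgrade if the test makes no difference.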

Yet more soon...

johnea
Comment by android (android) - Monday, 13 October 2008, 20:49 GMT

All right, here's some more info.

NOTE: I know y'all don't really care about these details, but it seems that if I'm going to do the testing that the facts should be captured somewhere. This is that somewhere.

In my above mentioned upgrade to [testing] I left out one critical package, xf86-video-intel. Somehow I overlooked that in the list. So I upgraded that package to [testing] as well. These four packages were upgraded to the [testing] repo for this test:
xorg-server 1.5.2-1
intel-dri 7.2-1
mesa 7.2-1
xf86-video-intel 2.4.2-1

I wonder about these packages, they are all from [core] or [extra]:
libxxf86dga 1.0.2-1
libxxf86misc 1.0.1-1
libxxf86vm 1.0.2-1
xf86-input-keyboard 1.3.1-1
xf86-input-mouse 1.3.0-1
xf86dgaproto 2.0.3-1
xf86miscproto 0.9.2-1
xf86vidmodeproto 2.2.2-1

So with the four mentioned packages upgraded from the [testing] repo I again tried to load the unrippable "Jeff Beck - Wired" audio CD into k3b. This again caused the entire system to crash.

The entire system crashes, not just the X server. The attached serial terminal mentioned above also locks up, even at a root prompt. The computer must be manually powered off to reboot.

As long as I was enjoying the morning crashing the computer I thought I'd try a few variations.

First I turned off the XRandR rotation for the 2 monitors and operated in non-rotated dual monitor mode. This had no effect, the system still crashes.

Second I stopped using XRandR extensions altogether (I commented them out in my .xinitrc) so that both monitors were in the default hardware "mirrored" mode. This also had no effect, the system still crashes.

Thirdly I disconnected the DVI monitor that is connected to the ADD2 PCI-express x16 card. This also yielded no change.

So with one monitor, no XRandR at all. Trying to rip this audio CD with k3b consistently crashes the whole computer requiring manual power off to reset.

I'm not an X-pert so I'm not totally clear on how the chipset driver fits in with the drivers for various features (dri, glx, etc) along with the core of the X server itself. But it seems that at this point X on the intel driver is just plain broken, at least on x86_64.

To paraphrase the irreverent jwz - "If an application can crash the X server, this is a bug in the X server by definition."

This bug goes even further, it crashes the entire system, not just X.

Where is unix's famous stability and reliability?

This is a clear example that even after 15 years, linux is still not ready for joe sixpack.

I've been on an archlinux workstation every day for about the last 6 years. As an electrical engineer, with 25 years in industry, it's much more productive for me than being saddled with the constraints of windoze. My 9 year old son has been using it since he was 4, but I still couldn't recommend it to my mom, since she's not here in the house where I could support her.

I will try one more test today. I will boot into a WM other than my usual fluxbox. Maybe I'll look at this new KDE I just loaded and see if things are the same there.

Please post replies if there is some specific thing I could try. This seems like a serious bug from this end and I'm happy to do what I can to help resolve it.

hasta...

johnea
Comment by Jan de Groot (JGC) - Thursday, 06 November 2008, 22:05 GMT
Does your system also crash when using the vesa driver? Looking at your problems I have some suspicion that it's not even X related, but related to other problems. Is it possible that your system gets a complete lockup because K3b is trying to rip your (copyprotected?) CD and freaks out the whole ATA subsystem?
Comment by android (android) - Wednesday, 19 November 2008, 19:41 GMT
Hello Jan,

I finally had a chance to test with the vesa driver today (after clicking OK on a jpilot calendar reminder crashed the X session).

I loaded xf86-video-vesa and switched the xorg.conf driver to vesa. I also commented out all xrandr settings in .xinitrc and eliminated the Virtual setting in xorg.conf.

Started X in single monitor mode with vesa driver and started the CD rip.

The system crashes completely (not just X session) as before.

So it seems your suspicion is correct, this is not related to intel driver or xrandr, and probably not related to X at all.

Since my weeks old X session with 50-100 applications open is now crashed I'll also perform a pacman -Syu and repost results after this upgrade.

Thanks for your thoughts!

johnea
Comment by Jan de Groot (JGC) - Friday, 05 December 2008, 08:39 GMT
Can you try to find out why it crashes exactly? Copying the commands that K3b executes with a CD that doesn't crash your system should help out. After that, switch to a text console and execute the commands with the CD that hangs up your system and you'll see the kernel panics, oopses or whatever it does to your system.

I'm assuming logs won't get written during the crash here. If they do get written, please attach them.
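
One way to improve the odds of keeping output from a command that takes the machine down is to force every line to disk as it arrives. The following is only a sketch under that assumption: the function name and log path are hypothetical, and a hard lockup can still lose whatever the disk had not finished writing.

```python
import os
import subprocess
import sys


def run_and_log(cmd, log_path):
    """Run cmd, appending each output line to log_path and syncing it to disk."""
    with open(log_path, "a") as log:
        proc = subprocess.Popen(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # merge stderr into the same stream
            text=True,
        )
        for line in proc.stdout:
            sys.stdout.write(line)  # still show it on the console
            log.write(line)
            log.flush()             # push Python's buffer to the kernel
            os.fsync(log.fileno())  # ask the kernel to write it to the disk
        return proc.wait()
```

Called from a text console with the same command line that K3b runs for a working CD, for example `run_and_log([...], "/root/rip.log")` with the real command filled in, the log should then hold everything printed up to shortly before the hang. Programs that buffer their own output when writing to a pipe may still deliver lines late.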
Comment by Glenn Matthys (RedShift) - Wednesday, 24 December 2008, 08:38 GMT
Android: can you run memtest please?
Comment by android (android) - Wednesday, 24 December 2008, 22:08 GMT
I just started `memtester 1024`. So far no failures.

I'll report more results after a couple of loops...

android
Comment by android (android) - Thursday, 25 December 2008, 03:50 GMT

memtester seems happy:

Loop 20:
Stuck Address : ok
Random Value : ok
Compare XOR : ok
Compare SUB : ok
Compare MUL : ok
Compare DIV : ok
Compare OR : ok
Compare AND : ok
Sequential Increment: ok
Solid Bits : ok
Block Sequential : ok
Checkerboard : ok
Bit Spread : ok
Bit Flip : ok
Walking Ones : ok
Walking Zeroes : ok

I've seen crashes using abcde without X at all. So I tend to think Jan is on to something with: "Is it possible that your system gets a complete lockup because K3b is trying to rip your (copyprotected?) CD and freaks out the whole ATA subsystem?"



I'm sure this isn't the correct venue, but I'm continuing to experience a variety of X crashes on different systems with different video controllers, under different conditions.

There seems to be some fundamental instability in Xorg or perhaps in the arch rolling release system.

I've got xine routinely crashing a 32 bit intel i810 system whenever it's full screen and the menus are brought up by right clicking in the video window.

Also a VIA openchrome based system randomly crashing in gnome (that one I'm going to try the memtester on)

As far as this x86_64 with intel graphics system, I've basically stopped trying to rip my CD collection. Which isn't a great workaround 8-(

Thanks for everyone's effort. Over the holidays I may have a little more time to try things than normal. Please let me know if there are tests I could run.

johnea
Comment by Jan de Groot (JGC) - Thursday, 25 December 2008, 14:49 GMT
Note that the only serious memory testers are memtest86 and memtest86+, which are small standalone programs that can test each and every memory location because they're not limited by an operating system.
