FS#22967 - Screen flickers with mesa, ati-dri, libgl, libdrm 7.10.0

Attached to Project: Arch Linux
Opened by Heiko Baums (cyberpatrol) - Saturday, 19 February 2011, 20:02 GMT
Last edited by Andreas Radke (AndyRTR) - Sunday, 01 May 2011, 18:38 GMT
Task Type: Bug Report
Category: Packages: Extra
Status: Closed
Assigned To: Tobias Powalowski (tpowa), Jan de Groot (JGC), Andreas Radke (AndyRTR)
Architecture: All
Severity: Medium
Priority: Normal
Reported Version:
Due in Version: Undecided
Due Date: Undecided
Percent Complete: 100%
Votes: 4
Private: No

Details

Description:
Since kernel26 2.6.37-6 the screen has been flickering regularly. Every 20-30 seconds the screen goes black for a millisecond or so (a short flicker).

This happens on X as well as on the console and with or without KMS. So it's not X but kernel related.

My video card is an ATI Radeon RV620 LE (Radeon HD 3450).

There are no related kernel parameters in the kernel line in /boot/grub/menu.lst.

Kernel26 2.6.37-5 and earlier are not affected.

Closed by Andreas Radke (AndyRTR) - Sunday, 01 May 2011, 18:38 GMT
Reason for closing: Fixed
Comment by Dave Reisner (falconindy) - Saturday, 19 February 2011, 20:07 GMT
Could you try 2.6.37.1 from testing and see if this still persists?
Comment by Heiko Baums (cyberpatrol) - Saturday, 19 February 2011, 21:23 GMT
I could test it, but I made some more tests and it's most likely not kernel but X related.

I tried it with kernel26 2.6.37-5, kernel26 2.6.37-6 and the corresponding versions of kernel26-fbcondecor. My mistake was that I booted kernel26-fbcondecor without KMS (nomodeset and vga) and kernel26 with KMS. I can't test it without KMS, because Xorg doesn't start anymore without KMS.

If I have xdm started but am not logged into X, the screen doesn't flicker, not even on the console. As soon as I'm logged into X, the screen starts flickering in X as well as on the console. This happens with both kernel versions.

So I guess it's related to mesa, ati-dri, libgl and/or libdrm, which were updated to version 7.10.0.git20110215-1 at the same time as kernel26 was updated to 2.6.37-6.

Btw., it would be nice if git versions weren't moved into the stable repos. Only stable upstream releases belong in the stable repos. There are only a very few exceptions, e.g. new software for which no working stable version has been released yet. And git packages should usually be named package-git.

I guess this bug actually belongs in the category "Packages: Extra".
Comment by Heiko Baums (cyberpatrol) - Sunday, 20 February 2011, 14:05 GMT
The issue still persists with kernel26 2.6.37.1 from [testing].
Like I said it's most likely at least one of the packages mesa, ati-dri, libgl and libdrm.

It would be nice if someone would change the subject to "Screen flickers with mesa, ati-dri, libgl, libdrm 7.10.0" and change the category to "Packages: Extra".
Comment by Alexandre Bique (babali) - Monday, 21 February 2011, 09:49 GMT
Hi, I can confirm the bug:

$ lspci
...
01:00.0 VGA compatible controller: ATI Technologies Inc RV620 LE [Radeon HD 3450]
...
$

I also tried to change the power management of the card:

[root@mordekaiser abique]# echo high > /sys/class/drm/card0/device/power_method
[root@mordekaiser abique]# echo $?
0
[root@mordekaiser abique]# cat /sys/class/drm/card0/device/power_method
profile
[root@mordekaiser abique]#


But as you can see, it doesn't work.
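A possible reason the write above has no effect: as far as the radeon KMS power management interface is documented, power_method accepts only dynpm or profile, while values such as high belong to the separate power_profile file in the same directory. The sketch below assumes that interface. The helper name set_radeon_profile and its optional directory argument are made up for illustration and are not part of any package.

```shell
#!/bin/sh
# Sketch: select a static radeon power profile through sysfs.
# Assumption: the radeon KMS driver exposes power_method (dynpm|profile)
# and power_profile (default|auto|low|mid|high) in the device directory.
# Usage: set_radeon_profile PROFILE [DEVICE_DIR]
set_radeon_profile() {
    dev="${2:-/sys/class/drm/card0/device}"
    case "$1" in
        default|auto|low|mid|high) ;;
        *) echo "unknown profile: $1" >&2; return 1 ;;
    esac
    # Switch to profile-based power management first ...
    echo profile > "$dev/power_method" || return 1
    # ... then pick the profile itself.
    echo "$1" > "$dev/power_profile" || return 1
    # Read both files back, because a rejected write may still exit with 0.
    cat "$dev/power_method" "$dev/power_profile"
}
```

Run as root, `set_radeon_profile high` should print the method and profile that are actually in effect afterwards, which makes a silently ignored write visible.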
Comment by Alexandre Bique (babali) - Monday, 21 February 2011, 09:53 GMT
BTW, shouldn't the severity be high?
Comment by Heiko Baums (cyberpatrol) - Monday, 21 February 2011, 12:39 GMT
This is at least very annoying and should be fixed soon.
Comment by Dan McGee (toofishes) - Friday, 25 February 2011, 00:36 GMT
I'm seeing this too, although it is only happening on one of my two monitors.
$ lspci | grep Rad
01:00.0 VGA compatible controller: ATI Technologies Inc RV770 [Radeon HD 4850]

Saying "very annoying and should be fixed soon" isn't going to magically get an Arch dev to look at this. You will need to do some research upstream and see if you can at least find the last working version, etc.
Comment by Heiko Baums (cyberpatrol) - Friday, 25 February 2011, 01:02 GMT
Like I said before, maybe it would help to put only stable upstream releases into the stable repos instead of unstable git versions. It would be worth a try. If this issue still persists with the latest stable upstream version, one could probably file an upstream bug report.
Comment by Dan McGee (toofishes) - Monday, 28 February 2011, 19:19 GMT
Just rebuilt with the changes in commit 110053 mostly reverted: running 7.10.0 from upstream, removed --enable-gallium-r600 and --enable-gallium-swrast, and installed the old r600 rather than the r600 gallium driver. Unfortunately I still have the screen flicker and now I'm not sure where to look.

I did notice that it didn't seem to start until I logged in and started Firefox; perhaps triggered by loading OpenGL-related stuff? I didn't see any flicker at all in my login manager (slim).
Comment by Andreas Radke (AndyRTR) - Monday, 28 February 2011, 19:33 GMT
You can try a .38 rc kernel to check for a kernel drm bug, and try an xf86-video-ati-git snapshot to check whether it's already fixed in the ddx driver upstream.

For Mesa: the mesa devs usually pick only really helpful patches, and pick them carefully, for their stable branch. So it's usually in a much better state than the last stable release.
Comment by Balló György (City-busz) - Tuesday, 08 March 2011, 11:40 GMT
I have a similar bug with the nouveau driver. The screen randomly blacks out for less than a second under continuous CPU usage. This also happened previously, but recently it has been happening more frequently.
My card:
01:00.0 VGA compatible controller: nVidia Corporation NV17 [GeForce4 MX 440] (rev a3)
Comment by Balló György (City-busz) - Thursday, 10 March 2011, 03:17 GMT
It seems that this bug has also been reported in Ubuntu:
https://bugs.launchpad.net/ubuntu/+source/xserver-xorg-video-ati/+bug/727979

And I opened a new bug for the nouveau related problem:
https://bugs.archlinux.org/task/23213
Comment by Balló György (City-busz) - Thursday, 10 March 2011, 15:51 GMT
@cyberpatrol: what is your error message in everything.log on each flicker?

I found a possible solution: http://lists.freedesktop.org/archives/dri-devel/2011-March/009057.html
Comment by Heiko Baums (cyberpatrol) - Friday, 11 March 2011, 01:32 GMT
I've indeed got the same error messages in kernel.log as in the e-mail which György has linked to.
Comment by Balló György (City-busz) - Saturday, 12 March 2011, 11:47 GMT
I'm now using a kernel with the above-mentioned patch, and the flickers have disappeared! I'm using the nouveau driver, but I think it should also solve the problem with the ati driver. It seems that getting the EDID sometimes fails, and this causes the problem.
Comment by Heiko Baums (cyberpatrol) - Saturday, 12 March 2011, 19:09 GMT
So I guess the Category should be changed to "Packages: Core" again, the subject should be changed to "Screen flickers with kernel26 >=2.6.37-6" again and should be assigned to the kernel devs.
Comment by Jelle van der Waa (jelly) - Wednesday, 30 March 2011, 16:05 GMT
I have screen flickers too with nvidia 270.30, so I downgraded nvidia and will check if I get them again.
01:00.0 VGA compatible controller: nVidia Corporation G84M [Quadro FX 570M] (rev a1)

I get Xid errors when the screen flickers.
Comment by Andreas Radke (AndyRTR) - Thursday, 31 March 2011, 16:35 GMT
Please find out whether only certain kernel (drm) versions cause this bug. Test with older kernels, .37.x in core and .38.x in testing. There's also drm-next and -testing in AUR you may want to try.
Comment by Heiko Baums (cyberpatrol) - Thursday, 31 March 2011, 17:58 GMT
The screen still flickers with kernel26 2.6.38.2-1 from [testing]. The bug only occurs since kernel26 2.6.37-6. Kernel26 2.6.37-5 and earlier were not affected. But I can't downgrade the kernel to those earlier versions to test if this is related to an update of the kernel or xorg because I haven't got the packages anymore in my package cache.

There's no package drm-next or drm-testing in AUR, btw., only an outdated package kernel26-drm-next.
Comment by Heiko Baums (cyberpatrol) - Thursday, 31 March 2011, 18:09 GMT
Updating the system completely to [testing] also doesn't fix this issue.
Comment by Andreas Radke (AndyRTR) - Thursday, 31 March 2011, 20:20 GMT
http://git.kernel.org/?p=linux%2Fkernel%2Fgit%2Fstable%2Flinux-2.6.37.y.git&a=search&h=HEAD&st=commit&s=drm

Go on and test which of these late drm "fixes" introduced the regression.
Comment by Heiko Baums (cyberpatrol) - Thursday, 31 March 2011, 20:36 GMT
You don't want me to recompile the kernel more than 100 times and remove every single patch? What if you would read the mailing list thread on freedesktop.org to which György has linked? I don't know if this fixes this issue, but I get the same error messages as I've written before.
Comment by Dan McGee (toofishes) - Thursday, 31 March 2011, 21:08 GMT
And what- do you expect Andy to do the work for you? Stop whining and do something about it if this bug affects you and you take offense to someone not doing work for you to fix it. man git-bisect.

For the record, I don't get any useful messages at all in dmesg output when my monitor flickers, and it is only my second monitor. I haven't had time to sit down and try bisecting this yet, otherwise I'd be able to lend more helpful information which is the primary reason I haven't commented in a while- nothing new to add.
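For readers unfamiliar with the git-bisect workflow mentioned above, a minimal sketch follows. It assumes a clone of the stable kernel tree and that the regression lies in the DRM code, so the search is limited to drivers/gpu/drm. The helper name bisect_drm is hypothetical, and the tags in the usage note are placeholders: the thread only names Arch package releases (2.6.37-5 good, 2.6.37-6 bad), not the matching upstream revisions.

```shell
#!/bin/sh
# Sketch: start a bisection between a known-good and a known-bad kernel
# revision, considering only commits that touch drivers/gpu/drm.
# Usage: bisect_drm GOOD_REV BAD_REV [REPO_DIR]
bisect_drm() {
    good="$1"
    bad="$2"
    repo="${3:-.}"
    # "git bisect start <bad> <good> -- <path>" marks both endpoints and
    # checks out a revision roughly halfway between them.
    git -C "$repo" bisect start "$bad" "$good" -- drivers/gpu/drm
}
```

A run might begin with `bisect_drm v2.6.37 v2.6.37.1 ~/src/linux-2.6.37.y` (placeholder revisions and path). After each checkout, build and boot that kernel, watch for the flicker, then mark the result with `git bisect good` or `git bisect bad`. Because every step halves the remaining range, a hundred candidate commits need about seven builds rather than a hundred. `git bisect reset` returns the tree to its original state.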
Comment by Heiko Baums (cyberpatrol) - Thursday, 31 March 2011, 21:46 GMT
I don't expect anything except that you read this thread on the mailing list. Then Andy and you would know that there's a patch which is supposed to fix this issue. I haven't tested it yet, but I get the same error messages as the original poster on this mailing list. And, btw., I'm not whining. If I could give more useful information I would.

And another thing: Andy asked me to test something. I don't think that doing this and telling him the results I get is whining or unhelpful.
Comment by Andreas Radke (AndyRTR) - Friday, 01 April 2011, 20:06 GMT
Calm down here. You should know our patching rules. We won't implement any untested patch into our kernel until you prove it to be a valid and safe fix. I'm not reading all stuff about each hardware related issue I can't reproduce myself. I've pointed you to the latest handful of upstream patches that changed the kernel drm module.

You should know how this goes in open source land. You are affected and only you can give useful feedback. If nobody does it, it won't get fixed. That's it. Nothing less and nothing more. So feel free to do the testing and report to us the final solution you find together with the upstream devs. Or it will stay broken.
Comment by Barry Jackson (barjac) - Friday, 01 April 2011, 22:59 GMT
I am seeing this exact same issue in Mageia Alpha using 2.6.38.2-1.mga (linus) with nouveau.

[root@localhost baz]# tailf /var/log/kernel/info.log
Apr 1 23:44:12 localhost kernel: [drm] nouveau 0000:01:00.0: Load detected on output B
Apr 1 23:44:42 localhost kernel: [drm] nouveau 0000:01:00.0: Load detected on output B
Apr 1 23:45:12 localhost kernel: [drm] nouveau 0000:01:00.0: Load detected on output B

These continue at exact 30 sec intervals (in sync with the flicker) while there is a TV connected to the SVHS out on my FX5600u card.

With the TV disconnected the messages stop, but the flicker continues at the same rate.

I have reported this in the Mageia bug system today but as yet there is no response https://bugs.mageia.org/show_bug.cgi?id=610 - I will also cross link to this bug from there.

I hope this may help someone with more knowledge than me to pin this down.
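To check whether such messages really recur at a fixed period, the gaps between their timestamps can be computed from the log. This is a rough sketch that assumes the classic syslog layout shown above (month, day, HH:MM:SS as the third field). The function name log_intervals is made up for illustration.

```shell
#!/bin/sh
# Sketch: print the number of seconds between successive log lines that
# match a pattern. Reads syslog-style lines on stdin and expects the
# time of day (HH:MM:SS) in the third whitespace-separated field.
# Usage: log_intervals PATTERN < logfile
log_intervals() {
    awk -v pat="$1" '
        $0 ~ pat {
            split($3, t, ":")
            now = t[1] * 3600 + t[2] * 60 + t[3]
            if (seen) {
                d = now - prev
                if (d < 0) d += 86400   # the log crossed midnight
                print d
            }
            prev = now
            seen = 1
        }'
}
```

For example, `log_intervals 'Load detected' < /var/log/kernel/info.log` prints one number per gap. For the three lines quoted above it prints 30 twice.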
Comment by Dan McGee (toofishes) - Saturday, 23 April 2011, 18:06 GMT
Damn it. I was all prepared to bisect my flicker issue today, and now I can't see it flickering anywhere! I have no idea why it disappeared, other than the fact that maybe it was never a kernel problem for me. Tested the following:
* Stock Arch kernel, 2.6.38.4-1
* make localmodconfig, 2.6.37
* make localmodconfig, 2.6.37.6

So my job is done here, apparently.
Comment by Alexandre Bique (babali) - Saturday, 23 April 2011, 18:31 GMT
I still have the bug at work.
Comment by Heiko Baums (cyberpatrol) - Saturday, 23 April 2011, 18:31 GMT
I still have the flickering with stock Arch kernel 2.6.38.4-1. But what's make localmodconfig?
Comment by Wei-Ning Huang (aitjcize) - Sunday, 01 May 2011, 08:17 GMT
I still have the bug with:
kernel 2.6.38.4
mesa/libgl 7.10.2

but if I downgrade mesa to 7.10.0, the problem is gone.
Comment by Heiko Baums (cyberpatrol) - Sunday, 01 May 2011, 16:58 GMT
With kernel26 2.6.38.4 and mesa, libgl and ati-dri from [testing] the bug is fixed.
Comment by Balló György (City-busz) - Sunday, 01 May 2011, 17:32 GMT
It seems that here is the upstream bug report: https://bugs.freedesktop.org/show_bug.cgi?id=36005
And it's probably fixed in upower 0.9.9, but currently I can't test it.
Comment by Heiko Baums (cyberpatrol) - Sunday, 01 May 2011, 18:28 GMT
@György: It seems you are right. upower 0.9.9 has been in [extra] since yesterday. Today I first upgraded to [testing], which of course included the upower update from [extra], and the screen flickering disappeared. Then I downgraded from [testing] to [core]/[extra] again and the screen still doesn't flicker.

So I guess this bug can be closed as fixed.
