FS#29257 - nxagent crashes when running firefox/thunderbird with cairo 1.12.0-3

Attached to Project: Arch Linux
Opened by Maciej Sitarz (macieks2) - Tuesday, 03 April 2012, 18:16 GMT
Last edited by Andreas Radke (AndyRTR) - Saturday, 12 May 2012, 12:46 GMT
Task Type Bug Report
Category Packages: Testing
Status Closed
Assigned To Tobias Powalowski (tpowa)
Andreas Radke (AndyRTR)
Architecture x86_64
Severity High
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 0%
Votes 1
Private No

Details

Description:
An established nxclient (NoMachine) connection crashes when firefox or thunderbird is run in the session.

After downgrading cairo to 1.10.2-3 the issue can't be reproduced.


Additional info:
* package version(s)
nxserver 3.5.0-5
nx-common 3.5.0-4
nxclient 3.5.0.7-1
cairo 1.12.0-2
firefox 11.0-3
thunderbird 11.0.1-1

* config and/or log files etc.
System logs:
Apr 03 19:54:52 HOST sshd[20073]: Accepted publickey for nx from 127.0.0.1 port 37582 ssh2
Apr 03 19:54:52 HOST sshd[20073]: pam_unix_session(sshd:session): session opened for user nx by (uid=0)
Apr 03 19:54:52 HOST console-kit-daemon[464]: missing action
Apr 03 19:54:52 HOST systemd-logind[460]: New session c30 of user nx.
Apr 03 19:54:53 HOST sshd[20209]: Accepted password for USER from 127.0.0.1 port 37583 ssh2
Apr 03 19:54:53 HOST sshd[20209]: pam_unix_session(sshd:session): session opened for user USER by (uid=0)
Apr 03 19:54:53 HOST console-kit-daemon[464]: missing action
Apr 03 19:54:54 HOST systemd-logind[460]: New session c31 of user USER.
Apr 03 19:56:18 HOST kernel: nxagent[20730]: segfault at 0 ip 00000000004a7f48 sp 00007fff603f64d0 error 4 in nxagent[400000+4d1000]


nx logs:
Setting default value
Failed to read: session.screen0.strftimeFormat
Setting default value
Failed to read: session.screen0.titlebar.left
Setting default value
Failed to read: session.screen0.titlebar.right
Setting default value
XIO: fatal IO error 11 (Resource temporarily unavailable) on X server ":1005"
after 272679 requests (272679 known processed) with 0 events remaining.
xterm: fatal IO error 11 (Resource temporarily unavailable) or KillClient on X server ":1005.0"



Steps to reproduce:
1. Connect to nxserver using nxclient (start fluxbox WM for best performance and low error output in logs)
2. Check if the session is working fine, start xterm etc.
3. Run firefox or thunderbird

After these steps, the logs above will appear and nxagent will crash.


As this defect showed up just after the cairo upgrade, I am reporting it here. Should I report it upstream to NoMachine instead?

Closed by  Andreas Radke (AndyRTR)
Saturday, 12 May 2012, 12:46 GMT
Reason for closing:  Fixed
Additional comments about closing:  fixed in new upstream nx-libs release 3.5.0.13
Comment by alex (kabolt) - Tuesday, 03 April 2012, 18:55 GMT
Is it really firefox 11.0-3 too?
Firefox 11.0-3 doesn't use the buggy cairo version and should work.
Comment by Maciej Sitarz (macieks2) - Tuesday, 03 April 2012, 20:00 GMT
Yes, I'm sure I have firefox 11.0-3:
$ yaourt -Qs firefox
testing/firefox 11.0-3
Standalone web browser from mozilla.org

I even did a test...
1. Downgraded cairo to 1.10.2-3
2. Started nxclient
3. Started firefox in NX session
4. Firefox worked OK, closed firefox
5. Upgraded cairo to 1.12.0-2
6. Started firefox in NX session and the NX session crashed


BTW
The topic should state "cairo 1.12.0-2" as there's no "1.12.0-3"
Comment by Maciej Sitarz (macieks2) - Tuesday, 03 April 2012, 20:05 GMT
One more test I did.

Force-removed cairo (pacman -Rdd cairo) to check how firefox would behave:
$ firefox
XPCOMGlueLoad error for file /usr/lib/firefox/libxpcom.so:
libcairo.so.2: cannot open shared object file: No such file or directory
Couldn't load XPCOM.

So now we know which lib loads cairo:
$ ldd /usr/lib/firefox/libxpcom.so |grep cairo
libpangocairo-1.0.so.0 => /usr/lib/libpangocairo-1.0.so.0 (0x00007f6887f87000)
libcairo.so.2 => /usr/lib/libcairo.so.2 (0x00007f6887a47000)
Comment by Andreas Radke (AndyRTR) - Wednesday, 04 April 2012, 18:30 GMT
NXagent segfaults immediately here when opening a remote connection if cairo 1.12.0 is installed on the server running an Xfce session. So this segfault is independent of Firefox.

I've already reported this to the X2go and FreeNX mailing lists. It seems Debian users can reproduce it too.
Comment by Jim (eris0xff) - Tuesday, 08 May 2012, 00:21 GMT
I'm a debian (aptosid) user, but I thought I'd report my investigation into this issue here. I have the same crash issues with libcairo (same version).

I recompiled the x2goagent (actually nxagent) package with debug symbols. It downloads all of the nx-libs which are required to re-create nxagent. From the size and recompile time it looks like it builds an entire X nest implementation.

Anyway I started an X session using x2go. I normally run KDE, but I do need Firefox, emacs and some other programs which depend on gtk etc.

After I logged in I started gdb in a separate remote SSH session and attached to x2goagent.

The process seg faults in the nxagentTrapezoids function on line 1752 of the nx-libs-3.5.0.12/nx-X11/hw/nxagent/Render.c file.

"if (nxagentDrawableStatus(pSrc -> pDrawable) == NotSynchronized)"

'p' here means Picture. The backtrace indicates that this is the result of a new or bogus render op.

pSrc is defined and pSrc->id seems valid, but pSrc->pDrawable == 0, which is why it segfaults.

I don't understand enough about X to determine the cause of this, but if I had to make an educated guess I'd say that the newest libcairo defines some sort of picture which is (for some reason) not drawable or "pDrawable" doesn't make sense for it.

In any case, here's the rest of the backtrace -- basically just the callstack for a render dispatch.

(gdb) bt
#0 0x00000000004ac268 in nxagentTrapezoids (op=3 '\003', pSrc=0x40c1770, pDst=0x40c1970, maskFormat=<optimized out>, xSrc=1680, ySrc=31, ntrap=1,
traps=0x304aed0) at Render.c:1752
#1 0x000000000043790a in ProcRenderTrapezoids (client=0x3328ee0) at X/NXrender.c:1123
#2 0x000000000043131d in ProcRenderDispatch (client=<optimized out>) at X/NXrender.c:2519
#3 0x00000000004305d6 in Dispatch () at X/NXdispatch.c:747
#4 0x000000000040fea5 in main (argc=11, argv=0x7fff7a1a8168, envp=<optimized out>) at main.c:450

I suppose that the most fruitful investigation of this would be to hunt down the render operation being called and either skip it or prevent it from being executed.





I examined some of the rendering variables. Haven't had a lot of time to examine the issue, but it appears that
Comment by Jim (eris0xff) - Wednesday, 09 May 2012, 16:04 GMT
I've performed Google searches on "pSrc -> pDrawable" and found that this bug crops up in nxagent every year or so. Many references to pSrc->pDrawable throughout the nxagent code never check that pDrawable is a valid pointer. Whether because picture structures are sent in error, because some spec has been updated or changed, or because of rarely used edge cases, a pSrc with a null drawable field gets sent. Whatever the cause, good programming practice is to always check such pointers before dereferencing them.

I'm currently testing nxagent in debug mode with some patches to (hopefully intelligently) skip certain operations if the picture source has a null drawable. As soon as I get a bug-free build I'll post the patches here.
Comment by Jim (eris0xff) - Wednesday, 09 May 2012, 21:11 GMT
Still fixing segfaults in Render.c. These issues crop up all over the Render.c code and also in some of the Pixel.h decision macros. The previous committer tried to remove them, but the problem still exists. I'm modifying some of the macro code to ease debugging.

Comment by Jim (eris0xff) - Thursday, 10 May 2012, 00:02 GMT
Still testing and debugging. The new libcairo must really exercise some newly defined renderops. Unless there are major upstream bugs in the new libcairo, render ops now allow both source images and image masks to be either completely null or null drawables. Kind of a mess to handle.
Comment by Jim (eris0xff) - Thursday, 10 May 2012, 16:32 GMT
Looks like my patches for nxagent work. I'll be distributing them as soon as I can find the official open source nx authority. Would it be x2go or xorg?
Comment by Jim (eris0xff) - Thursday, 10 May 2012, 16:33 GMT
If you need them before they get posted around, just let me know and I'll email them to you or post them here.
Comment by Jim (eris0xff) - Thursday, 10 May 2012, 19:50 GMT
Here's the patch against the latest version:

*** x2go/nx-libs-3.5.0.12/nx-X11/programs/Xserver/hw/nxagent/Render.c 2012-03-07 14:04:02.000000000 -0700
--- x2go-new/nx-libs-3.5.0.12/nx-X11/programs/Xserver/hw/nxagent/Render.c 2012-05-10 11:09:39.631786853 -0600
***************
*** 995,1000 ****
--- 995,1030 ----
#endif
}

+
+ int nxagentShouldDeferComposite(PicturePtr pSrc, PicturePtr pMask, PicturePtr pDst)
+ {
+
+ int drawableDst;
+ int linkDeferred;
+ int unSyncedSrcMask;
+
+ drawableDst = ( nxagentRenderVersionMajor == 0 &&
+ nxagentRenderVersionMinor == 8 &&
+ (pDst) -> pDrawable -> type == DRAWABLE_PIXMAP
+ );
+
+ linkDeferred = ( nxagentOption(DeferLevel) >= 2 &&
+ nxagentOption(LinkType) < LINK_TYPE_ADSL
+ );
+
+ unSyncedSrcMask = ( nxagentOption(DeferLevel) == 1 &&
+ (pDst) -> pDrawable -> type == DRAWABLE_PIXMAP &&
+ (
+ (pSrc -> pDrawable && (nxagentDrawableStatus(pSrc -> pDrawable) == NotSynchronized)) ||
+ ((pMask) && pMask -> pDrawable && (nxagentDrawableStatus((pMask) -> pDrawable) == NotSynchronized))
+ )
+ );
+
+
+ return drawableDst || linkDeferred || unSyncedSrcMask;
+ }
+
+
void nxagentComposite(CARD8 op, PicturePtr pSrc, PicturePtr pMask, PicturePtr pDst,
INT16 xSrc, INT16 ySrc, INT16 xMask, INT16 yMask, INT16 xDst,
INT16 yDst, CARD16 width, CARD16 height)
***************
*** 1036,1043 ****
}

#endif
!
! if (NXAGENT_SHOULD_DEFER_COMPOSITE(pSrc, pMask, pDst))
{
pDstRegion = nxagentCreateRegion(pDst -> pDrawable, NULL, xDst, yDst, width, height);

--- 1066,1073 ----
}

#endif
! /* if (NXAGENT_SHOULD_DEFER_COMPOSITE(pSrc, pMask, pDst)) */
! if (nxagentShouldDeferComposite(pSrc, pMask, pDst))
{
pDstRegion = nxagentCreateRegion(pDst -> pDrawable, NULL, xDst, yDst, width, height);

***************
*** 1095,1101 ****
}
}

! if (pMask != NULL && pMask -> pDrawable != pSrc -> pDrawable &&
pMask -> pDrawable != pDst -> pDrawable)
{
nxagentSynchronizeShmPixmap(pMask -> pDrawable, xMask, yMask, width, height);
--- 1125,1132 ----
}
}

! if ((pMask) && (pMask->pDrawable) &&
! pMask -> pDrawable != pSrc -> pDrawable &&
pMask -> pDrawable != pDst -> pDrawable)
{
nxagentSynchronizeShmPixmap(pMask -> pDrawable, xMask, yMask, width, height);
***************
*** 1259,1265 ****
* on the real X server.
*/

! if (nxagentDrawableStatus(pSrc -> pDrawable) == NotSynchronized)
{
#ifdef TEST
fprintf(stderr, "nxagentGlyphs: Synchronizing source [%s] at [%p].\n",
--- 1290,1296 ----
* on the real X server.
*/

! if (pSrc -> pDrawable && (nxagentDrawableStatus(pSrc -> pDrawable) == NotSynchronized))
{
#ifdef TEST
fprintf(stderr, "nxagentGlyphs: Synchronizing source [%s] at [%p].\n",
***************
*** 1302,1315 ****
nxagentSynchronizeBox(pSrc -> pDrawable, &glyphBox, NEVER_BREAK);
}

! if (pSrc -> pDrawable -> type == DRAWABLE_PIXMAP)
{
nxagentIncreasePixmapUsageCounter((PixmapPtr) pSrc -> pDrawable);
}
}

! if (pSrc -> pDrawable != pDst -> pDrawable &&
! nxagentDrawableStatus(pDst -> pDrawable) == NotSynchronized)
{
#ifdef TEST
fprintf(stderr, "nxagentGlyphs: Synchronizing destination [%s] at [%p].\n",
--- 1333,1347 ----
nxagentSynchronizeBox(pSrc -> pDrawable, &glyphBox, NEVER_BREAK);
}

! if (pSrc -> pDrawable && (pSrc -> pDrawable -> type == DRAWABLE_PIXMAP))
{
nxagentIncreasePixmapUsageCounter((PixmapPtr) pSrc -> pDrawable);
}
}

!
! if (pSrc -> pDrawable && (pSrc -> pDrawable != pDst -> pDrawable &&
! nxagentDrawableStatus(pDst -> pDrawable) == NotSynchronized))
{
#ifdef TEST
fprintf(stderr, "nxagentGlyphs: Synchronizing destination [%s] at [%p].\n",
***************
*** 1749,1755 ****
return;
}

! if (nxagentDrawableStatus(pSrc -> pDrawable) == NotSynchronized)
{
#ifdef TEST
fprintf(stderr, "nxagentTrapezoids: Going to synchronize the source drawable at [%p].\n",
--- 1781,1789 ----
return;
}

! /* the following blocks need fixing to ignore null values of pDrawable */
!
! if (pSrc -> pDrawable && (nxagentDrawableStatus(pSrc -> pDrawable) == NotSynchronized))
{
#ifdef TEST
fprintf(stderr, "nxagentTrapezoids: Going to synchronize the source drawable at [%p].\n",
***************
*** 1843,1849 ****
* operation like nxagentTrapezoids() does.
*/

! if (nxagentDrawableStatus(pSrc -> pDrawable) == NotSynchronized)
{
#ifdef TEST
fprintf(stderr, "nxagentTriangles: Going to synchronize the source drawable at [%p].\n",
--- 1877,1885 ----
* operation like nxagentTrapezoids() does.
*/

!
!
! if (pSrc -> pDrawable && (nxagentDrawableStatus(pSrc -> pDrawable) == NotSynchronized))
{
#ifdef TEST
fprintf(stderr, "nxagentTriangles: Going to synchronize the source drawable at [%p].\n",
***************
*** 1920,1926 ****
* operation like nxagentTrapezoids() does.
*/

! if (nxagentDrawableStatus(pSrc -> pDrawable) == NotSynchronized)
{
#ifdef TEST
fprintf(stderr, "nxagentTriStrip: Going to synchronize the source drawable at [%p].\n",
--- 1956,1963 ----
* operation like nxagentTrapezoids() does.
*/

!
! if (pSrc -> pDrawable && (nxagentDrawableStatus(pSrc -> pDrawable) == NotSynchronized))
{
#ifdef TEST
fprintf(stderr, "nxagentTriStrip: Going to synchronize the source drawable at [%p].\n",
***************
*** 1997,2003 ****
* operation like nxagentTrapezoids() does.
*/

! if (nxagentDrawableStatus(pSrc -> pDrawable) == NotSynchronized)
{
#ifdef TEST
fprintf(stderr, "nxagentTriFan: Going to synchronize the source drawable at [%p].\n",
--- 2034,2041 ----
* operation like nxagentTrapezoids() does.
*/

!
! if (pSrc -> pDrawable && (nxagentDrawableStatus(pSrc -> pDrawable) == NotSynchronized))
{
#ifdef TEST
fprintf(stderr, "nxagentTriFan: Going to synchronize the source drawable at [%p].\n",


Comment by Maciej Sitarz (macieks2) - Saturday, 12 May 2012, 09:00 GMT
Please attach the patch as a file, not directly in the comment.
