FS#55720 - [texlive-bin] [poppler] heap corruption with -3

Attached to Project: Arch Linux
Opened by Jon Gjengset (Jonhoo) - Saturday, 23 September 2017, 19:28 GMT
Last edited by Rémy Oudompheng (remyoudompheng) - Wednesday, 27 September 2017, 06:11 GMT
Task Type Bug Report
Category Packages: Extra
Status Closed
Assigned To Jan de Groot (JGC)
Andreas Radke (AndyRTR)
Rémy Oudompheng (remyoudompheng)
Architecture All
Severity Medium
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 15
Private No

Details

Description:

Since upgrading to texlive-bin 2017.44590-3, I've been experiencing pdflatex hangs and crashes across several machines. The errors all seem to be related to heap corruption. Examples include:

- *** Error in `pdflatex': corrupted size vs. prev_size: 0x000055ac5cd8a82f ***
- “pdflatex paper.tex” terminated by signal SIGSEGV (Address boundary error)
- pdflatex _pthread_mutex_lock assertion 'INTERNAL_SYSCALL_ERRNO (e, __err) != ESRCH || !robust' failed

Downgrading to -2 (and downgrading poppler to 0.57) fixes the problem.

The TeX files I'm operating on unfortunately aren't publicly available, but I thought I'd flag it anyway.

Closed by  Rémy Oudompheng (remyoudompheng)
Wednesday, 27 September 2017, 06:11 GMT
Reason for closing:  Fixed
Additional comments about closing:  in texlive-bin 2017.44590-5
Comment by Tamaskan (Tamaskan) - Saturday, 23 September 2017, 19:54 GMT
Same problem here, pdflatex segfaults, older version works.
Comment by M (mjr) - Sunday, 24 September 2017, 00:19 GMT
I'm also running into this bug. In a document I'm currently working on (also not public - sorry) the bug is triggered by inclusion of a certain pdf using \includepdf from the pdfpages package. I've no idea what is special about this pdf file (an article produced by a reputable academic journal) but including other pdfs appears to still work fine.
Comment by P. Siegl (kanal108) - Sunday, 24 September 2017, 00:41 GMT
I have problems with version 2017.44590-3 as well, meaning that it also bails out when I include pdfs:
\includegraphics[page=1,trim={2.5cm 0.7cm 2.5cm 1.4cm},clip,width=0.85\textwidth]{../pics/tmp.pdf}

[6 <../pics/tmp.pdf, page is rotated 90 degrees Internal Error (0): Call to Object where the object was type 5, not the expected type 8
Aborted (core dumped)

I can confirm that a rollback to poppler 0.57 and texlive-bin 2017.44590-2 fixes the issue.
Comment by Jon Gjengset (Jonhoo) - Sunday, 24 September 2017, 00:45 GMT
I don't know exactly how Arch uses severity and priority in the bug tracker, but given the scope of this problem, I'd suggest that one or both is raised.
Comment by Eli Schwartz (eschwartz) - Sunday, 24 September 2017, 18:14 GMT
https://wiki.archlinux.org/index.php/Reporting_bug_guidelines#Severity :)

I daresay \includepdf is neither "The main functionality" nor "System crash or severe boot failure that is likely to affect more than just you".
Comment by Jon Gjengset (Jonhoo) - Sunday, 24 September 2017, 18:16 GMT
@eschwartz: The crash is not only for \includepdf; in my case the error seems to be triggered by \includegraphics. I would argue that, taken together, these are present in a large number of LaTeX projects built with pdflatex, and thus the main functionality of pdflatex is broken for a large set of users.
Comment by Doug Newgard (Scimmia) - Sunday, 24 September 2017, 18:25 GMT
And pdflatex isn't the "main functionality" of the package.
Comment by Jon Gjengset (Jonhoo) - Sunday, 24 September 2017, 18:30 GMT
@Scimmia: That may be true, but it is still what many consumers of this package use it for. Given that the (relatively large) patch applied to this package introduces a segfault in pdflatex, it is also not unreasonable to assume that other binaries may be affected. All that said, I don't think it's particularly important for the level to be raised, as this will likely be fixed regardless, so I will retract my suggestion that they be raised, to avoid cluttering the comments :)
Comment by Rémy Oudompheng (remyoudompheng) - Sunday, 24 September 2017, 21:04 GMT
Please attach a small TeX document and PDF file to include in order to reproduce the issue.
Thanks.
Comment by Tamaskan (Tamaskan) - Sunday, 24 September 2017, 21:46 GMT
Here you go. PDF image was created using Microsoft Visio 2016.
Comment by Matthew Lawson (the_vorpal_blade) - Sunday, 24 September 2017, 22:03 GMT
There's a little extra info in this duplicate:
https://bugs.archlinux.org/task/55721

Specifically, in PDFs that included text, those where the text was in the form of embedded fonts triggered the crash whereas those in which the fonts had been converted to paths did not.
Comment by Eli Schwartz (eschwartz) - Sunday, 24 September 2017, 22:30 GMT
The duplicate  FS#55721  happened to be very well written FWIW :)

@remy, it came with a reproducer as well as a poppler bugreport https://bugs.freedesktop.org/show_bug.cgi?id=102952
Comment by Larry (lzlarryli) - Monday, 25 September 2017, 04:54 GMT
I have the same problem. I am not sure whether this is a poppler bug, because I fixed it by installing texlive from https://www.tug.org/texlive/.

This bug is also reported here:
https://bugs.freedesktop.org/show_bug.cgi?id=102952
Comment by Dan Koschier (Deian) - Monday, 25 September 2017, 07:26 GMT
Can confirm. When including PDFs as images I get the error:
"Internal Error (0): Call to Object where the object was type 5, not the expected type 8".
Downgrading to "texlive-bin-2017.44590-2-x86_64.pkg.tar.xz" (and the corresponding libpoppler packages) fixed it for now.
Comment by Rémy Oudompheng (remyoudompheng) - Monday, 25 September 2017, 10:33 GMT
Can you also attach the document which causes the heap corruption reported initially?
Thanks.
Comment by Jon Gjengset (Jonhoo) - Monday, 25 September 2017, 13:17 GMT
The exact file length seems to matter, as it changes the crash condition, presumably because it overwrites memory with other values.
This particular combination gives me `*** Error in `pdflatex': corrupted size vs. prev_size: 0x0000564d0a1454cf ***`
Comment by Rémy Oudompheng (remyoudompheng) - Monday, 25 September 2017, 19:23 GMT
I cannot reproduce the heap corruption with texlive-bin 2017.44590-3 and poppler 0.59.0-1.
Comment by Jon Gjengset (Jonhoo) - Monday, 25 September 2017, 21:06 GMT
@remyoudompheng: because it's a heap corruption, the exact memory layout changes the outcome. In some cases it may even run fine without crashing, corrupting some random piece of memory that doesn't cause problems. Try adding a few lines of text, reordering the inputs, or other things that may nudge the memory layout, and that should trigger it.
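The perturbation approach suggested above (adding a few lines of text until the crash shows up) could be automated roughly as follows. This is only a sketch under stated assumptions: the helper names `padded_source`, `classify` and `try_variants`, the filler text, the padding range and the timeout are all hypothetical, and `paper.tex` is the file name used elsewhere in this task. It relies on Python's `subprocess` reporting death by signal N as return code -N, so on x86-64 Linux a SIGSEGV shows up as -11 and the glibc heap-check abort (SIGABRT) as -6.

```python
import subprocess
from pathlib import Path

END_DOC = r"\end{document}"


def padded_source(tex: str, n: int) -> str:
    """Return tex with n filler paragraphs inserted just before \\end{document}."""
    filler = "".join(f"Filler line {i}.\n\n" for i in range(n))
    head, sep, tail = tex.rpartition(END_DOC)
    if not sep:
        raise ValueError("no \\end{document} found")
    return head + filler + sep + tail


def classify(returncode: int) -> str:
    """Map a subprocess return code to a coarse outcome label."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        # subprocess reports death by signal N as -N.
        return f"killed by signal {-returncode}"
    return f"exit status {returncode}"


def try_variants(tex_path: str, max_pad: int = 20, timeout: float = 120.0):
    """Run pdflatex on padded copies of tex_path and yield (n, outcome)."""
    src = Path(tex_path)
    original = src.read_text()
    for n in range(max_pad + 1):
        # Keep each variant next to the original so relative figure paths resolve.
        variant = src.with_name(f"{src.stem}-pad{n}.tex")
        variant.write_text(padded_source(original, n))
        try:
            proc = subprocess.run(
                ["pdflatex", "-interaction=nonstopmode", variant.name],
                cwd=src.parent,
                capture_output=True,
                timeout=timeout,
            )
            outcome = classify(proc.returncode)
        except subprocess.TimeoutExpired:
            # The hang case: the process never returned within the timeout.
            outcome = "hang (timed out)"
        yield n, outcome


if __name__ == "__main__":
    for n, outcome in try_variants("paper.tex"):
        print(n, outcome)
```

A run that prints "ok" for every variant does not prove the corruption is absent; it only means none of these particular layouts crashed.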
Comment by Rémy Oudompheng (remyoudompheng) - Monday, 25 September 2017, 21:12 GMT
I cannot debug a heap corruption that I cannot reproduce. If you know a document that triggers the heap corruption, please provide it.

PS: don't flag as duplicates bugs that are not the same issue.
Comment by Rémy Oudompheng (remyoudompheng) - Monday, 25 September 2017, 21:17 GMT
Please note that package version 2017.44590-4 fixes the *different* issue reported as  FS#55721 .
Please only comment here about the original poster's issue, which is the heap corruption.
Comment by Christos Tzelepis (nullgeppetto) - Monday, 25 September 2017, 21:21 GMT
@remyoudompheng: My apologies, I saw the error above and thought it was relevant. I will remove my comments. Thanks for letting me know.
Edit: Well, it seems that I cannot remove them (only edit). If you can delete them, please do it :) Sorry about this.
Comment by Jon Gjengset (Jonhoo) - Monday, 25 September 2017, 21:23 GMT
remyoudompheng: the documents I attached reproduce the problem on two of my Arch Linux machines with 2017.44590-3 and poppler 0.59.0-1. But again, the exact memory layout of the program at runtime matters, so you may need to add a few letters into the document to trigger the crash on your machine.
Comment by Rémy Oudompheng (remyoudompheng) - Tuesday, 26 September 2017, 04:33 GMT
Memory layout does not depend on machine, however it may be randomized, so I will need to have very precise answers:
- are you using architecture x86-64?
- do you reproduce the issue with the files paper.tex and figures_design-backfill-query.pdf, figures_design-ancestor-query-legend.pdf?
- which command do you launch to reproduce the issue?
- on the machines where you reproduce the issue, does it happen deterministically or randomly once every N runs?
- when you say it happens on two of your machines, does it mean there are machines where you never observed it?
- does package texlive-bin 2017.44590-4 change any of the answers to the above questions?
- are the three errors (corrupted size, SIGSEGV, assertion failed) mentioned in the bug description distinct errors that appeared in 3 different runs?

Do you run a "publicly" available compilation server receiving arbitrary input from the outside?
Comment by Jon Gjengset (Jonhoo) - Tuesday, 26 September 2017, 14:41 GMT
- Yes, I am using x86_64.
- Yes, I see the issue with exactly the files uploaded.
- `pdflatex paper.tex`
- It happens deterministically. However, if I change even small things in the file, it can make the issue go away. Or the error changes. For example, removing either figure, setting an empty title or author, or removing [twocolumn] all make the document compile fine.
- I have not had a chance to try this on a third machine, so no.
- `texlive-bin 2017.44590-4` does not fix the issue.
- The three different errors occur with slightly different versions of paper.tex. corrupted size seems to be by far the most common one.

A fourth error case is that compilation simply hangs. This happened in my last run (again, with modifications to paper.tex), and gdb reports:
(gdb) i threads
Id Target Id Frame
* 1 Thread 0x7f1fed9fb780 (LWP 24772) "pdflatex" 0x00007f1fea58642a in __pthread_mutex_lock_full () from /usr/lib/libpthread.so.0
(gdb) bt
#0 0x00007f1fea58642a in __pthread_mutex_lock_full () from /usr/lib/libpthread.so.0
#1 0x00007f1fed000714 in Dict::decRef() () from /usr/lib/libpoppler.so.70
#2 0x00007f1fed068baa in Object::free() () from /usr/lib/libpoppler.so.70
#3 0x00007f1fed06bbd1 in PageAttrs::~PageAttrs() () from /usr/lib/libpoppler.so.70
#4 0x00007f1fed06ce4b in Page::~Page() () from /usr/lib/libpoppler.so.70
#5 0x00007f1fecff3e5a in Catalog::~Catalog() () from /usr/lib/libpoppler.so.70
#6 0x00007f1fed072c8e in PDFDoc::~PDFDoc() () from /usr/lib/libpoppler.so.70
#7 0x000055d418aadc4c in delete_document(PdfDocument*) ()
#8 0x000055d418a9c85a in deleteimage ()
#9 0x000055d418a7db65 in zpdfshipout ()
#10 0x000055d418a8d68b in maincontrol ()
#11 0x000055d418a3caba in mainbody ()
#12 0x000055d418a27a2f in main ()
Comment by Jon Gjengset (Jonhoo) - Tuesday, 26 September 2017, 14:41 GMT
And no, I do not run a compilation server that receives untrusted input.
Comment by Rémy Oudompheng (remyoudompheng) - Tuesday, 26 September 2017, 19:48 GMT
Thanks for the information. Although I am still totally unable to reproduce the crash (including after fiddling with paper.tex contents), I can identify a double-free from valgrind analysis.
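For anyone wanting to repeat the valgrind check on their own machine, a minimal sketch follows. It assumes valgrind's default Memcheck tool, which writes its report to stderr, reports a double free under the heading "Invalid free() / delete / delete[] / realloc()", and ends with an "ERROR SUMMARY" line. The function names `run_memcheck` and `summarize` are hypothetical, and the thread does not say which exact valgrind options were used for the analysis above.

```python
import re
import subprocess

# Heading Memcheck prints for a free of an already-freed (or invalid) block.
INVALID_FREE = "Invalid free() / delete / delete[] / realloc()"
SUMMARY_RE = re.compile(r"ERROR SUMMARY: ([\d,]+) errors? from ([\d,]+) contexts?")


def summarize(log: str) -> dict:
    """Count invalid-free reports and parse the last ERROR SUMMARY line."""
    result = {
        "invalid_free_reports": log.count(INVALID_FREE),
        "errors": None,
        "contexts": None,
    }
    matches = SUMMARY_RE.findall(log)
    if matches:
        errors, contexts = matches[-1]
        result["errors"] = int(errors.replace(",", ""))
        result["contexts"] = int(contexts.replace(",", ""))
    return result


def run_memcheck(tex_file: str = "paper.tex") -> str:
    """Run pdflatex under valgrind and return valgrind's report (stderr)."""
    proc = subprocess.run(
        ["valgrind", "pdflatex", "-interaction=nonstopmode", tex_file],
        capture_output=True,
        text=True,
        errors="replace",  # TeX output is not guaranteed to be valid UTF-8
    )
    return proc.stderr


if __name__ == "__main__":
    print(summarize(run_memcheck()))
```

Valgrind groups identical errors into contexts, so the number of report headings in the log and the total in the summary line can differ. Comparing the same figure before and after a package change is the useful part.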
Comment by Jon Gjengset (Jonhoo) - Tuesday, 26 September 2017, 20:02 GMT
Hmm, it's so strange that it doesn't reproduce on your side. I'd be happy to run additional analysis on my side if there's anything in particular you'd like me to try?
Comment by Rémy Oudompheng (remyoudompheng) - Tuesday, 26 September 2017, 20:22 GMT
Are you able to recompile texlive-bin with the attached patch?

According to valgrind, it reduces the number of double-free errors during "pdflatex paper.tex" to 6 from 90 in previous package versions. I am not able to reduce the error count to zero easily. The errors arose from an incorrect migration to the new poppler API (introduced in poppler 0.58).

If it solves your issue I will integrate that patch version in the package.
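As an illustration of how a double-free can arise with manually reference-counted objects (the backtrace earlier in this task goes through `Object::free()` and `Dict::decRef()`), here is a toy model. It is not poppler or pdfTeX code: the classes `Dict` and `Handle` and both usage functions are hypothetical, and this thread does not state whether this is the exact mistake in the patched texlive-bin. The idea is that a handle which releases its reference automatically, combined with a leftover manual release from an older calling convention, drops the count one time too many.

```python
class Dict:
    """Toy stand-in for a manually reference-counted object."""

    def __init__(self):
        self.refs = 1       # the creator holds the first reference
        self.freed = False

    def inc_ref(self):
        self.refs += 1

    def dec_ref(self):
        if self.freed:
            # In C or C++ this would be a silent use-after-free or double free.
            raise RuntimeError("double free: dec_ref on freed Dict")
        self.refs -= 1
        if self.refs == 0:
            self.freed = True


class Handle:
    """Toy handle that takes one reference and gives it back on close()."""

    def __init__(self, target: Dict):
        target.inc_ref()
        self.target = target

    def close(self):
        if self.target is not None:
            self.target.dec_ref()
            self.target = None


def correct_usage() -> Dict:
    d = Dict()      # refs == 1, held by the "document"
    h = Handle(d)   # refs == 2
    h.close()       # handle gives back its reference: refs == 1
    d.dec_ref()     # document releases: refs == 0, freed exactly once
    return d


def buggy_usage() -> Dict:
    d = Dict()      # refs == 1
    h = Handle(d)   # refs == 2
    d.dec_ref()     # leftover manual release: refs == 1
    h.close()       # handle releases as well: refs == 0, freed too early
    d.dec_ref()     # the document's own release now hits a freed object
    return d
```

In the toy model the extra release raises an exception. In a real C or C++ program it instead corrupts allocator state, which would be consistent with symptoms that change when the document changes.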
Comment by Rémy Oudompheng (remyoudompheng) - Tuesday, 26 September 2017, 20:38 GMT
I also precompiled a package for you, with GPG signatures, available at:

https://pkgbuild.com/~remy/texlive-experimental/
Comment by Jon Gjengset (Jonhoo) - Tuesday, 26 September 2017, 21:12 GMT
Yes, the precompiled package at https://pkgbuild.com/~remy/texlive-experimental/ fixes the problem for me!
Both for the test case, and for the larger document in which I originally encountered the issue.
Thank you for sticking this one out!
