FS#71856 - [leptonica] 1.81 breaks (some) monochrome TIFF->PDF conversion
Attached to Project:
Community Packages
Opened by Alexander Kobel (akobel) - Tuesday, 17 August 2021, 16:16 GMT
Last edited by Toolybird (Toolybird) - Sunday, 14 May 2023, 01:27 GMT
Details
Leptonica 1.81 breaks (among other things) TIFF to PDF conversion; see https://github.com/DanBloomberg/leptonica/issues/586 for a discussion of the issue, including fully automated tests.
Prominent callers of this function are the widely used tesseract OCR and several document scanning and post-processing tools that use tesseract internally.

Reverting commit 2881dfb049aea0821b506e5a5ed0048eef749c04 from upstream fixes the issue. This commit was introduced in 1.81.0 as a pure performance optimization for embedding CCITT Group4-compressed monochrome TIFF images, but it only works for a subset of valid files (essentially, ones created by unwrapping them from PDF). The performance gain is not tremendous in many cases (e.g., negligible compared to OCR or non-trivial image processing), and certainly not worth risking correctness. It would be nice if this hotfix could be applied as an interim measure until a long-term solution is found upstream.

On a side note, leptonica is currently built without the openjpeg2 dependency and, hence, without JPEG2k support; it'd be nice if this could be included.

Attached is a corresponding PKGBUILD that covers both points.
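For anyone who wants to check whether their own inputs fall into the class of files the optimization targets, here is a minimal sketch that reads the Compression tag of a TIFF's first image directory. It does not reproduce the bug and cannot tell whether a particular file will convert incorrectly; it only identifies CCITT Group4-compressed TIFFs. The function names are made up for illustration, and the sketch assumes a classic (non-BigTIFF) file and looks only at the first page.

```python
import struct

COMPRESSION_TAG = 259  # TIFF "Compression" tag
CCITT_GROUP4 = 4       # Compression value for CCITT T.6 (Group 4)


def tiff_compression(data):
    """Return the Compression value of the first IFD, or None if not a classic TIFF.

    Hypothetical helper for illustration; BigTIFF and later pages are ignored.
    """
    if len(data) < 8:
        return None
    if data[:2] == b"II":
        endian = "<"  # little-endian file
    elif data[:2] == b"MM":
        endian = ">"  # big-endian file
    else:
        return None
    magic, ifd_offset = struct.unpack(endian + "HI", data[2:8])
    if magic != 42 or ifd_offset + 2 > len(data):
        return None
    (count,) = struct.unpack_from(endian + "H", data, ifd_offset)
    for i in range(count):
        entry = ifd_offset + 2 + 12 * i  # each IFD entry is 12 bytes
        if entry + 12 > len(data):
            return None
        tag, _type, _n = struct.unpack_from(endian + "HHI", data, entry)
        if tag == COMPRESSION_TAG:
            # Compression is a SHORT stored at the start of the value field.
            (value,) = struct.unpack_from(endian + "H", data, entry + 8)
            return value
    return 1  # tag absent: the TIFF default is 1 (uncompressed)


def is_ccitt_g4(path):
    """True if the first page of the TIFF at `path` is Group4-compressed."""
    with open(path, "rb") as fh:
        return tiff_compression(fh.read()) == CCITT_GROUP4
```

Calling `is_ccitt_g4("scan.tif")` (with a file name of your own) on the inputs to a failing conversion shows whether they are Group4-compressed at all; files that are not should be unaffected by the commit in question.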
Closed by Toolybird (Toolybird)
Sunday, 14 May 2023, 01:27 GMT
Reason for closing: Fixed
Additional comments about closing: We're on leptonica 1.83.1-1 now so assuming fixed.
The final upstream fix (slightly more elaborate, and slightly more performant under certain conditions) will probably only land in 1.82.0, which is to be released soon-ish.