FS#71435 - [paperwork] should have tesseract as dependency to let OCR work

Attached to Project: Community Packages
Opened by allexj (allexj) - Sunday, 04 July 2021, 12:28 GMT
Last edited by Toolybird (Toolybird) - Sunday, 11 June 2023, 23:05 GMT
Task Type: Bug Report
Category: Packages
Status: Closed
Assigned To: Balló György (City-busz)
Architecture: All
Severity: Low
Priority: Normal
Reported Version:
Due in Version: Undecided
Due Date: Undecided
Percent Complete: 100%
Votes: 1
Private: No

Details

Description:
"Redo OCR" does not work on PDF pages that contain images.
More info: https://gitlab.gnome.org/World/OpenPaperwork/paperwork/-/issues/992
To get it working, I had to run `paperwork-gtk chkdeps` and then install tesseract and, in my case, tesseract-data-ita.
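
The workaround above can be sketched as a small check script. This is only an illustration, assuming a POSIX shell; the helper name `have_cmd` is hypothetical, and the script only prints hints rather than installing anything. The package names are the ones mentioned in this report.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the given command is found on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if have_cmd tesseract; then
    # tesseract is installed; list the language data it can see.
    echo "tesseract found, available languages:"
    tesseract --list-langs
else
    # Package names as given in this report; pick the language data you need.
    echo "tesseract missing; on Arch try: pacman -S tesseract tesseract-data-ita"
    echo "then re-check with: paperwork-gtk chkdeps"
fi
```

If the language list does not include the language of the scanned documents, the matching tesseract-data package is probably still missing.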

Additional info:
* package version(s): 2.0.3
* link to upstream bug report: https://gitlab.gnome.org/World/OpenPaperwork/paperwork/-/issues/992

Closed by Toolybird (Toolybird)
Sunday, 11 June 2023, 23:05 GMT
Reason for closing: Not a bug
Additional comments about closing: See comments
Comment by Caleb Maclennan (alerque) - Tuesday, 01 February 2022, 07:38 GMT
It sounds like it should actually be an `optdepends=()`, is that right? In other words some features work fine without, but for some functions you need to add tesseract.
Comment by Toolybird (Toolybird) - Thursday, 11 May 2023, 04:49 GMT
tesseract is already an optdep of python-pyocr, which is a direct dep of paperwork. In other words, when installing paperwork, you would see "tesseract: OCR backend" in the pacman output, which users are expected to pay attention to. Opinions seem to differ amongst Arch PMs on whether this kind of thing should be a direct optdep or an indirect optdep (which it is currently). I'm leaning towards closing this... unless anyone begs to differ?
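
For readers unfamiliar with the distinction discussed in the comments, a direct optdep would be declared in the paperwork PKGBUILD itself. The fragment below is a hypothetical sketch, not the actual PKGBUILD: only the dependency named in this task is shown, and the description string is copied from the pacman output quoted above.

```shell
# Hypothetical PKGBUILD fragment (bash syntax), for illustration only.
pkgname=paperwork

# The real package has more dependencies; only the one named in this task is listed.
depends=('python-pyocr')

# Direct optional dependency: pacman prints this hint when paperwork is installed.
optdepends=('tesseract: OCR backend')
```

In the current (indirect) arrangement the same `optdepends` entry lives in the python-pyocr package instead, so the hint appears when python-pyocr is installed as a dependency of paperwork.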
