FS#41746 - [tesseract] tesseract 3.03rc1-1 complains about missing osd.traineddata
Attached to Project:
Community Packages
Opened by Matt (madalu) - Friday, 29 August 2014, 03:26 GMT
Last edited by Sergej Pupykin (sergej) - Tuesday, 03 March 2015, 19:06 GMT
Details
Description: default hocr configuration does not work.
When generating hocr output, tesseract fails and complains about a missing /usr/share/tessdata/osd.traineddata. This is because /usr/share/tessdata/configs/hocr has an additional line as of 3.03rc1-1:

tessedit_pageseg_mode 1

However, this will not work on Arch, as there is no osd data packaged for Arch. Here is the error message:

out.001 - Tesseract Open Source OCR Engine v3.03 with Leptonica
Error opening data file /usr/share/tessdata/osd.traineddata
Please make sure the TESSDATA_PREFIX environment variable is set to the parent directory of your "tessdata" directory.
Failed loading language 'osd'
Tesseract couldn't load any languages!
Warning: Auto orientation and script detection requested, but osd language failed to load

Additional info:
* package version(s): 3.03rc1-1
- I also have tesseract-data-eng, tesseract-data-deu, and tesseract-data-france installed
* config and/or log files etc.: no special config

Steps to reproduce:
call tesseract on scanned pnm file,
- e.g., "tesseract out.001.pnm ocr hocr"

This command works with no problems with 3.02.02-4. However, with 3.03rc1-1, tesseract complains (see error message above).
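To confirm that a given installation is hit by this, the condition described in the report (the hocr config requests page segmentation mode 1 while osd.traineddata is absent) can be checked with a few lines of shell. This is only a sketch: check_osd is a hypothetical helper name, not part of tesseract, and the paths in the usage comment are the ones quoted in the report.

```shell
# check_osd CONFIG TESSDATA_DIR
# Hypothetical helper: prints "affected" if CONFIG sets
# tessedit_pageseg_mode to 1 and TESSDATA_DIR has no osd.traineddata,
# otherwise prints "ok".
check_osd() {
    config=$1
    tessdata=$2
    if grep -Eq '^[[:space:]]*tessedit_pageseg_mode[[:space:]]+1[[:space:]]*$' "$config" \
        && [ ! -f "$tessdata/osd.traineddata" ]; then
        echo "affected"
    else
        echo "ok"
    fi
}

# Usage with the paths from the report:
#   check_osd /usr/share/tessdata/configs/hocr /usr/share/tessdata
```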
Closed by Sergej Pupykin (sergej)
Tuesday, 03 March 2015, 19:06 GMT
Reason for closing: Fixed
Additional comments about closing: training utils and osd.traineddata added
If someone has the authority to change it, I would much appreciate it.
As a workaround until the maintainer fixes this, one can download and untar the file manually and copy its contents to /usr/share/tessdata/.
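The copy step of that workaround could be scripted roughly as follows. This is a sketch under assumptions: install_osd is a hypothetical helper, the source path is wherever the downloaded archive was unpacked (no download location is given in this thread, so none is assumed here), and writing to /usr/share/tessdata requires root.

```shell
# install_osd SRC DEST_DIR
# Hypothetical helper: copies an already downloaded and unpacked
# osd.traineddata (SRC) into the tessdata directory DEST_DIR.
# Does nothing if DEST_DIR already contains the file.
install_osd() {
    src=$1
    dest_dir=$2
    if [ ! -f "$src" ]; then
        echo "source file not found: $src" >&2
        return 1
    fi
    if [ -f "$dest_dir/osd.traineddata" ]; then
        echo "osd.traineddata already present in $dest_dir"
        return 0
    fi
    cp "$src" "$dest_dir/osd.traineddata"
}

# Usage (run as root; the source path is an example only):
#   install_osd ./tessdata/osd.traineddata /usr/share/tessdata
```

Once the packaged osd.traineddata is installed, the manually copied file is no longer needed.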
From
https://code.google.com/p/tesseract-ocr/wiki/Compiling
-------------
If you want the training tools (3.03), you will also need to run the following commands:
make training
sudo make training-install
Build of training tools is not available if you do not have necessary dependencies (pay attention to messages from ./configure script).
--------------