Tesseract
Posted by Vincenzo Rubano · one-minute read

Initially developed by Hewlett-Packard (HP) and later by Google, Tesseract is an open source project that provides two different (yet related) things:
- an OCR engine (called libtesseract), available as a library;
- a command line program (called tesseract) that performs a complete OCR process using the features provided by libtesseract.
Because it is written in C and C++, Tesseract can be used on many different platforms, and prebuilt packages are provided for the supported ones. Unsurprisingly, bindings that expose libtesseract to a wide variety of programming languages have been developed as well. Several graphical OCR applications that use Tesseract "under the hood" to perform the OCR process are available too.
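
To make the command line program a little more concrete, here is a minimal Python sketch that drives it through the standard library instead of a dedicated binding. It relies on the basic `tesseract IMAGE OUTPUTBASE -l LANG` invocation. The helper names `build_tesseract_command` and `run_ocr`, as well as the file names in the usage note, are hypothetical and chosen only for illustration. The sketch also assumes that the `tesseract` executable is on the PATH and that the requested language data is installed.

```python
import shutil
import subprocess
from pathlib import Path


def build_tesseract_command(image_path, output_base, lang="eng"):
    """Return the argument list for: tesseract IMAGE OUTPUTBASE -l LANG."""
    return ["tesseract", str(image_path), str(output_base), "-l", lang]


def run_ocr(image_path, output_base, lang="eng"):
    """Run the tesseract CLI and return the recognized text.

    Assumes the tesseract executable is on PATH and that the language
    data for `lang` is installed (both are assumptions of this sketch).
    """
    if shutil.which("tesseract") is None:
        raise FileNotFoundError("tesseract executable not found on PATH")
    # check=True raises CalledProcessError if tesseract exits with an error.
    subprocess.run(build_tesseract_command(image_path, output_base, lang), check=True)
    # With no output configuration given, tesseract writes OUTPUTBASE.txt.
    return Path(f"{output_base}.txt").read_text(encoding="utf-8")
```

Calling `run_ocr("scan.png", "scan_output")` would, under those assumptions, run the full OCR process on `scan.png` and return the contents of `scan_output.txt`. Keeping the command construction in its own function makes it easy to inspect or log the exact invocation before running it.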