Tesseract

Posted by Vincenzo Rubano on Thursday, July 14, 2022 · one-minute read.

Initially developed by Hewlett-Packard (HP) and later by Google, Tesseract is an open-source project that provides two distinct but related things:

an OCR engine (called libtesseract), available as a library;
a command-line program (called tesseract) that runs a complete OCR process using the features provided by libtesseract.
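
To make the command-line program more concrete, here is a minimal Python sketch that calls it through the standard subprocess module. It assumes the tesseract executable is installed and on the PATH, and that the English language data ("eng") is available. The function names build_tesseract_command and ocr_image are hypothetical helpers introduced only for this illustration; they are not part of Tesseract.

```python
import shutil
import subprocess


def build_tesseract_command(image_path, language="eng"):
    """Build the argument list for a basic tesseract run.

    Passing "stdout" as the output base asks tesseract to print the
    recognized text instead of writing it to a file; "-l" selects the
    language data to use.
    """
    return ["tesseract", str(image_path), "stdout", "-l", language]


def ocr_image(image_path, language="eng"):
    """Run tesseract on an image and return the recognized text.

    Returns None when the tesseract executable is not found on PATH
    (an assumption of this sketch, not Tesseract behavior).
    """
    if shutil.which("tesseract") is None:
        return None
    result = subprocess.run(
        build_tesseract_command(image_path, language),
        capture_output=True,  # collect the text tesseract prints
        text=True,            # decode the output as str
        check=True,           # raise if tesseract exits with an error
    )
    return result.stdout
```

Calling ocr_image("scan.png"), where "scan.png" is a placeholder file name, would return the recognized text as a string, or None if tesseract is not installed.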

Because it is written in C and C++, Tesseract can be used on many different platforms, and prebuilt packages are provided for the supported ones. Bindings that expose libtesseract to a wide variety of programming languages have been developed as well, and several graphical OCR applications use Tesseract “under the hood” to perform the OCR process.

View Tesseract

Filed under:

Assistive Technologies
Accessibility and Linux
Accessibility in Mac OS
Optical Character Recognition (OCR)
Accessibility and Microsoft Windows