uiucprescon.ocr Package

Public API

class uiucprescon.ocr.Reader(language_code, tesseract_data_path)

Reading the text from an image file

Note

A Reader object should not be instantiated directly. Instead, construct one with the Engine class’s Engine.get_reader() method.

read(file: str)

Generate text from an image

Parameters: file – File path to an image
Returns: Text extracted from the image

class uiucprescon.ocr.Engine(data_set_path)

The engine that drives the OCR processing

get_reader(lang: str) → uiucprescon.ocr.reader.AbsReader

Builder method for creating reader objects for a specific language

Parameters: lang – Letter code that represents the language of a Tesseract data set.
Returns: A Reader object that can be used to extract text from an image.

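The builder flow described above (create an Engine, ask it for a reader, then call read()) might look roughly like the following. This is a minimal sketch, not the package's own example: the helper name extract_text is hypothetical, and the data path "path/to/tessdata", the language code "eng", and the file name "page.tif" are placeholder assumptions.

```python
def extract_text(engine, lang, image_path):
    """Return the text that a reader for `lang` extracts from `image_path`.

    `engine` is expected to behave like uiucprescon.ocr.Engine: it only needs
    a get_reader(lang) method whose result offers a read(file) method.
    """
    # Obtain the reader through the engine rather than instantiating Reader.
    reader = engine.get_reader(lang)
    return reader.read(image_path)


def main():
    # Imported here so the helper above can be used without the package
    # being importable at module load time.
    from uiucprescon import ocr

    # Hypothetical directory holding the Tesseract data sets.
    engine = ocr.Engine("path/to/tessdata")
    # "eng" and "page.tif" are placeholder values for illustration only.
    print(extract_text(engine, "eng", "page.tif"))
```

Keeping the helper dependent only on get_reader() and read() means it can be exercised with a stand-in engine when Tesseract data is not available.
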
get_version() → str

Check the version of Tesseract that this Python package is linked against. An example value might be the string “3.05.02”.
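
Because get_version() returns a plain string, comparing versions needs a small amount of parsing. The sketch below is one possible approach, not part of the package: parse_version and meets_minimum are hypothetical helper names, and the handling of a trailing suffix (such as "-beta") is an assumption about what the string might contain.

```python
import re


def parse_version(version):
    """Turn a dotted version string such as "3.05.02" into a tuple of ints.

    Only the leading numeric portion is used; anything after it (for example
    an assumed "-beta" suffix) is ignored.
    """
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(version, minimum):
    """Return True if `version` is at least `minimum` (both dotted strings)."""
    return parse_version(version) >= parse_version(minimum)
```

With an Engine instance named engine, a check could then be written as meets_minimum(engine.get_version(), "3.05"). Comparing integer tuples avoids the pitfalls of comparing the raw strings character by character.
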