The LEADTOOLS OCR SDK provides functions for incorporating optical character recognition (OCR) technology into an application. OCR converts bitmap images of documents into text. LEADTOOLS OCR modules provide functions to:
Recognize and export text, choosing from a variety of text, word processing, database, or spreadsheet file formats.
Select the language of documents to be recognized. Choose from English, Danish, Dutch, Finnish, French, German, Italian, Norwegian, Portuguese, Russian, Spanish, or Swedish.
Segment complex pages manually or automatically into text zones, image zones, table zones, lines, headers and footers.
Set accuracy thresholds before recognition begins to control how accurate the results must be.
Recognize text from 5 to 72 points in virtually any typeface.
Automatically detect fax, dot matrix, and other degraded documents and compensate accordingly.
Process both text and graphics. The recognition software's ability to distinguish halftone graphics from text can provide the basis of a compound document processing system.
Save the document in any of up to 40 file formats, including MS Word, MS Excel, dBASE, PDF, and WordPerfect.
Increase recognition accuracy with built-in and user dictionaries.
✎ NOTE
User words and dictionaries are no longer supported in the LEADTOOLS OCR Module - OmniPage Engine.
LEADTOOLS uses an OCR handle to interact with the OCR engine and the internal OCR list of pages. The handle represents a communication session between LEADTOOLS OCR and an OCR engine installed on the system. Internally, it is a structure that contains all the information needed for recognition, for getting and setting information, and for text verification.
To begin programming with LEADTOOLS OCR, you must first unlock the OCR features. Unless OCR features are unlocked, OCR properties, methods, and events will not be available.
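The lifecycle described above (unlock first, then work through a handle that owns the page list) can be sketched with a small mock. This is an illustration only: `MockOcrToolkit`, `MockOcrHandle`, `OcrLockedError`, and the unlock key are hypothetical names invented for this sketch and are not part of the LEADTOOLS API; the "recognition" step returns placeholder strings rather than real OCR output.

```python
class OcrLockedError(RuntimeError):
    """Raised when OCR functionality is used before it has been unlocked."""


class MockOcrHandle:
    """Hypothetical session object that owns the page list and settings."""

    def __init__(self):
        self.pages = []            # internal list of pages for this session
        self.language = "English"  # a per-session setting
        self.closed = False

    def add_page(self, image_name):
        """Append a page to the session and return its index."""
        self._check_open()
        self.pages.append(image_name)
        return len(self.pages) - 1

    def recognize(self):
        """Return one placeholder result per page (no real OCR is done)."""
        self._check_open()
        return [f"<text of {name} ({self.language})>" for name in self.pages]

    def close(self):
        """End the session and release its pages."""
        self.pages.clear()
        self.closed = True

    def _check_open(self):
        if self.closed:
            raise RuntimeError("handle is closed")

    def __enter__(self):
        return self

    def __exit__(self, *exc_info):
        self.close()
        return False


class MockOcrToolkit:
    """Hypothetical stand-in for the toolkit; NOT the LEADTOOLS API."""

    def __init__(self):
        self._unlocked = False

    def unlock(self, key):
        # Assumption: the mock accepts any non-empty key.
        if not key:
            raise ValueError("an unlock key is required")
        self._unlocked = True

    def open_handle(self):
        # OCR functionality is unavailable until the features are unlocked.
        if not self._unlocked:
            raise OcrLockedError("unlock OCR features before opening a handle")
        return MockOcrHandle()


toolkit = MockOcrToolkit()
try:
    toolkit.open_handle()              # fails: features are still locked
except OcrLockedError as err:
    print("locked:", err)

toolkit.unlock("hypothetical-key")     # placeholder key, not a real one
with toolkit.open_handle() as handle:
    handle.language = "German"
    handle.add_page("invoice.tif")
    print(handle.recognize())
```

The point of the design is that every piece of per-session state (pages, language, other settings) lives on the handle, so closing the handle releases it all at once. For the actual unlock and handle functions, consult the toolkit's own reference.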
Unlocking Special LEAD OCR Features