Introduction

LEADTOOLS OCR engines extend the functionality of LEADTOOLS SDKs by providing properties, methods, and events for easily incorporating optical character recognition (OCR) and magnetic ink character recognition (MICR) into your applications. Fast OCR processing is especially useful in forms recognition and processing applications. Fax, dot-matrix, and halftone documents can be preprocessed to improve recognition results.

Features include preset confidence and accuracy levels, artificial intelligence, and built-in and user-defined lexicons for limiting the type of text to recognize within a particular zone. The OCR engines can perform automatic area segmentation to create multi-layered zones for recognition of areas such as tables, rules, images, and text. Alternatively, you can manually designate up to 250 such zones.
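
As a rough illustration of the zone and lexicon concepts described above, the following Python sketch models a page with manually designated zones. It is not LEADTOOLS code: Zone, Page, add_zone, accepts, and MAX_MANUAL_ZONES are hypothetical names chosen for this example. The only values taken from the text are the 250-zone limit and the idea that a lexicon restricts which text a zone may return.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set

MAX_MANUAL_ZONES = 250  # limit stated in the text above


@dataclass
class Zone:
    """Hypothetical zone record; not a LEADTOOLS type."""
    kind: str                            # e.g. "text", "table", "image"
    left: int
    top: int
    width: int
    height: int
    lexicon: Optional[Set[str]] = None   # lowercase words; None means unrestricted

    def accepts(self, word: str) -> bool:
        """Return True if the word is allowed by this zone's lexicon."""
        return self.lexicon is None or word.lower() in self.lexicon


@dataclass
class Page:
    """Hypothetical page holding manually designated zones."""
    zones: List[Zone] = field(default_factory=list)

    def add_zone(self, zone: Zone) -> int:
        """Append a zone and return its index; refuse to exceed the limit."""
        if len(self.zones) >= MAX_MANUAL_ZONES:
            raise ValueError(
                f"a page can hold at most {MAX_MANUAL_ZONES} manual zones"
            )
        self.zones.append(zone)
        return len(self.zones) - 1


page = Page()
index = page.add_zone(Zone("text", 10, 10, 400, 60, lexicon={"yes", "no"}))
print(index, page.zones[index].accepts("Yes"), page.zones[index].accepts("maybe"))
```

In this sketch a zone without a lexicon accepts any word, which mirrors the idea that lexicons are an optional restriction applied per zone.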

The OCR engines support multiple fonts, sizes (5 to 72 point), and styles for all major European and Scandinavian languages (Danish, Dutch, Finnish, French, German, Italian, Norwegian, Portuguese, Russian, Spanish, and Swedish) as well as English and MICR (Magnetic Ink Character Recognition). Support for dialects such as U.S. English, French Canadian, Latin American Spanish, Swiss German, and Brazilian Portuguese is also provided.

Recognized text can be exported to more than 40 different formats, including MS Word, MS Excel, dBASE, and WordPerfect.

Key Features:

Run multiple OCR engines side by side to keep supporting existing applications while building new applications on the fastest available engine.
Recognize a variety of documents, including facsimiles, photocopies and documents with complex layouts.
Select the language to be recognized, and recognize multiple languages within one document. Choose from English, Danish, Dutch, Finnish, French, German, Italian, Norwegian, Portuguese, Russian, Spanish, or Swedish.
Recognize text from 5 to 72 points in virtually any typeface, and export to more than 40 different text, word processing, database, or spreadsheet output formats, including MS Word, MS Excel, dBASE, and WordPerfect.
Recognize special characters, such as MICR (Magnetic Ink Character Recognition) characters, including bank routing indicators, check amounts, and customer account numbers.
PDF and PDF/A output, ICR, and OMR support are also available.
Recognize multiple document pages at once and save the recognition results to a single file.
Add page(s) to the internal OCR list of pages.
Process both text and halftone graphics. The recognition software's ability to distinguish halftone graphics from text can provide the basis of a compound document processing system.
Each document may contain multiple OCR zones, and each zone may use any of the specialized OCR recognition engines (modules): MOR, MTX, and FireWorX.
Segment complex pages manually or automatically into text zones, image zones, table zones, lines, headers and footers.
Display document pages with or without their zones.
Recognize text and colors within tables.
Automatically orient documents for improved recognition.
Automatically detect fax, dot matrix, and other degraded documents and compensate accordingly.
Correct document characteristics such as noise, darkness, and lightness to achieve the best possible character recognition.
Increase recognition accuracy with built-in lexical classes and user-defined lexicons (dictionaries).
Set desired accuracy (confidence) thresholds prior to recognition to balance speed and accuracy of recognition.
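
To make the idea of a confidence threshold concrete, here is a minimal Python sketch. It is not the LEADTOOLS API, and it does not show how the engines apply a preset threshold during recognition. Instead it illustrates a simpler, related technique: filtering results after recognition. RecognizedWord and split_by_confidence are hypothetical names, and the 0 to 100 confidence scale is an assumption made for this example.

```python
from typing import List, NamedTuple, Tuple


class RecognizedWord(NamedTuple):
    """Hypothetical result record; not a LEADTOOLS type."""
    text: str
    confidence: int  # assumed scale: 0 (no confidence) to 100 (certain)


def split_by_confidence(
    words: List[RecognizedWord], threshold: int
) -> Tuple[List[RecognizedWord], List[RecognizedWord]]:
    """Split words into (accepted, needs_review) around the threshold."""
    if not 0 <= threshold <= 100:
        raise ValueError("threshold must be between 0 and 100")
    accepted = [w for w in words if w.confidence >= threshold]
    needs_review = [w for w in words if w.confidence < threshold]
    return accepted, needs_review


words = [
    RecognizedWord("Invoice", 97),
    RecognizedWord("T0tal", 58),
    RecognizedWord("1,250.00", 88),
]
accepted, needs_review = split_by_confidence(words, threshold=80)
print([w.text for w in accepted])
print([w.text for w in needs_review])
```

A higher threshold sends more words to review and leaves fewer errors in the accepted set, while a lower threshold does the opposite. This is the same kind of trade-off the feature above describes.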

Supported Environments

The toolkit comes in Win32 and x64 editions that can support development of software applications for any of the following environments:

Windows 7

Windows Vista

Windows XP

Windows 2000

For more information, refer to:

An Overview of Recognition Modules

Programming with LEADTOOLS OCR

Demo Programs

Tutorials