Introduction
LEADTOOLS OCR Module - OmniPage Engine C API Introduction
The LEADTOOLS OCR Module - OmniPage Engine C API extends the functionality of LEADTOOLS SDKs by providing properties, methods, and events for easily incorporating optical and magnetic ink character recognition (OCR / MICR) into applications. Superior OCR processing speeds are exceptionally useful in form recognition and processing applications. Fax, dot matrix, and halftone documents can be preprocessed to improve recognition results.
Key Features
- Recognize a variety of documents, including facsimiles, photocopies and documents with complex layouts.
- Select the language to be recognized, and recognize multiple languages within one document. Choose from English, Danish, Dutch, Finnish, French, German, Italian, Norwegian, Portuguese, Russian, Spanish, or Swedish.
- Support is also provided for dialects such as U.S. English, French Canadian, Latin American Spanish, Swiss German, and Brazilian Portuguese.
- Recognize text from 5 to 72 points in virtually any typeface, and export to more than 40 different text, word processing, database, or spreadsheet output formats, including MS Word, MS Excel, PDF and more.
- Recognize special characters, such as MICR (Magnetic Ink Character Recognition), including bank routing indicators, check amounts, and customer account numbers.
- PDF and PDF/A output, as well as ICR and OMR support, are also available.
- Recognize multiple document pages at once and save the recognition results to a single file.
- Add page(s) to the internal OCR list of pages.
- Process both text and halftone graphics. The recognition software's ability to distinguish halftone graphics from text can provide the basis of a compound document processing system.
- Each document may contain multiple OCR zones, and each zone may use any of the specialized OCR recognition engines (modules): MOR, MTX, and FireWorX.
- Segment complex pages manually or automatically into text zones, image zones, table zones, lines, headers and footers.
- Display document pages with or without their zones.
- Recognize text and colors within tables.
- Auto-orientate documents for improved recognition.
- Automatically detect fax, dot matrix, and other degraded documents and compensate accordingly.
- Correct document characteristics such as noise, darkness, and lightness to achieve the best possible character recognition.
- Increase recognition accuracy with built-in lexical classes and user-defined lexicons (dictionaries).
- Set desired accuracy (confidence) thresholds prior to recognition to balance speed and accuracy of recognition.
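The idea of a confidence threshold mentioned in the last item can be illustrated with a small, library-independent sketch. It does not use any LEADTOOLS function or type: the `RecognizedChar` structure, the 0 to 100 confidence scale, and the `apply_confidence_threshold` helper are all hypothetical names assumed for illustration. It also applies the threshold to finished results, whereas the engine takes its thresholds before recognition, so it shows only the general effect of a stricter or looser threshold.

```c
#include <stddef.h>

/* Hypothetical record for one recognized character.
   This is NOT a LEADTOOLS type; it exists only for this sketch. */
typedef struct {
    char ch;         /* the recognized character */
    int  confidence; /* assumed scale: 0 (least certain) to 100 (most certain) */
} RecognizedChar;

/* Copies the recognized characters into `out` as a NUL-terminated string,
   replacing every character whose confidence is below `threshold` with
   `reject_mark`. Returns the number of characters that were rejected.
   `out` must have room for at least `count + 1` bytes. */
size_t apply_confidence_threshold(const RecognizedChar *chars, size_t count,
                                  int threshold, char reject_mark, char *out)
{
    size_t rejected = 0;

    for (size_t i = 0; i < count; ++i) {
        if (chars[i].confidence < threshold) {
            out[i] = reject_mark; /* flag the low-confidence character */
            ++rejected;
        } else {
            out[i] = chars[i].ch; /* accept the character as recognized */
        }
    }
    out[count] = '\0';
    return rejected;
}
```

In this sketch, raising the threshold flags more characters for review, while lowering it accepts more characters at a higher risk of errors.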
Supported Environments
See Also
Reference
API Overview
Getting Started
Programming with LEADTOOLS OCR
An Overview of Recognition Modules
Demo Programs
Tutorials
Version History
LEADTOOLS OCR Module - OmniPage Engine C API Changes