The award-winning LEADTOOLS OCR Modules provide methods for incorporating optical character recognition (OCR) technology into an application and include everything needed to develop robust, high-performance, scalable document imaging solutions.
An important feature of the OCR design is the support of multiple OCR engines. The ability to choose the right engine for a specific solution gives unprecedented flexibility to developers. To reduce complexity and overall development time, the design hides the underlying engine details through the use of a common .NET class library. Changing underlying OCR engines based on the requirements of the project requires virtually no change to the application code.
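The engine-independent design described above can be illustrated with a minimal sketch. This is not the LEADTOOLS .NET API: the `OcrEngine` interface, the two stand-in engines, and `create_engine` are all hypothetical names chosen for illustration. The point is only that application code written against a common interface does not change when the engine does.

```python
from typing import Protocol


class OcrEngine(Protocol):
    """Hypothetical common interface; not the LEADTOOLS API."""

    def recognize(self, image_path: str) -> str: ...


class FastEngine:
    """Stand-in engine (illustrative only); returns placeholder text."""

    def recognize(self, image_path: str) -> str:
        return f"[fast] text from {image_path}"


class AccurateEngine:
    """Second stand-in engine with the same interface."""

    def recognize(self, image_path: str) -> str:
        return f"[accurate] text from {image_path}"


# Registry of available engines, keyed by a configuration name.
ENGINES = {"fast": FastEngine, "accurate": AccurateEngine}


def create_engine(name: str) -> OcrEngine:
    """Build the engine selected by configuration."""
    try:
        return ENGINES[name]()
    except KeyError:
        raise ValueError(f"unknown engine: {name!r}") from None


def process_document(engine: OcrEngine, image_path: str) -> str:
    """Application code: depends only on the common interface."""
    return engine.recognize(image_path).strip()
```

Switching engines here means changing the name passed to `create_engine`; `process_document` stays the same.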
In addition, LEAD offers a wide range of functionality that may be added to LEADTOOLS as needed. These advanced features include barcode detection, recognition and printing; forms registration, recognition and processing; Optical Mark Recognition (OMR); and handwritten text recognition (ICR).
Key Features:
Each of the following LEADTOOLS OCR Modules includes interfaces that greatly simplify coding and speed development of OCR applications. Additionally, the same code can be used with any of the LEADTOOLS OCR Modules.
The Plus and Pro versions of the LEADTOOLS OCR module include several OCR subsystems. These OCR subsystems are used internally to apply powerful two- and three-way voting techniques to ensure the most accurate results possible. Additionally, the Plus and Pro versions include state-of-the-art automatic image enhancements to ensure high performance and low error rates for any type of image composition or source, including faxed and dot-matrix-printed images.
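How the subsystems vote internally is not described here. As a rough illustration of the general idea only, the sketch below takes a character-by-character majority vote over the outputs of two or three engines. It assumes the outputs are already aligned to the same length, and the `vote` function and its `fallback` parameter are hypothetical.

```python
from collections import Counter


def vote(candidates: list[str], fallback: str = "?") -> str:
    """Majority vote, character by character, over aligned engine outputs."""
    if not candidates:
        raise ValueError("need at least one candidate")
    if len({len(c) for c in candidates}) != 1:
        raise ValueError("candidates must be aligned to the same length")
    result = []
    for chars in zip(*candidates):
        (best, count), *rest = Counter(chars).most_common()
        # A tie for first place means no reading wins outright.
        tied = bool(rest) and rest[0][1] == count
        result.append(fallback if tied else best)
    return "".join(result)
```

With three engines a single misread is outvoted, so `vote(["c1ean", "clean", "cleam"])` yields `"clean"`. With two engines a disagreement can only be flagged, so `vote(["cat", "cot"])` yields `"c?t"`.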
LEADTOOLS OCR Module - Plus
Includes automatic and manual zone detection, formatted output, auto-orientation, custom spelling dictionaries and MICR support. *PDF and *PDF/A output, ICR and OMR support are available.
LEADTOOLS OCR Module - Professional
Includes the fastest and most accurate automatic and manual zone detection, Unicode support, formatted output, auto-orientation, custom spelling dictionaries and MICR support. *PDF and *PDF/A output, ICR and OMR support are available. Asian language support is also available as an add-on.
LEADTOOLS OCR Module - Advantage
Includes automatic and manual zone detection, formatted output and auto-orientation support. *PDF and *PDF/A output and OMR support are available.
LEADTOOLS OCR Module - Arabic
Provides an Arabic language OCR engine with support for automatic and manual zone detection, formatted output and auto-orientation. *PDF and *PDF/A output and OMR support are available.
The LEADTOOLS OCR Engine can deliver precise coordinate, confidence and attribute data for each recognized character, giving the application great control over the formatting of the output text: at one extreme mirroring the input document, at the other permitting a unique user-defined style.
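To make the value of per-character data concrete, here is a minimal sketch of what an application might do with it. The `RecognizedChar` record, its field names, and the 0-100 confidence scale are assumptions for illustration and do not reflect the actual LEADTOOLS data structures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecognizedChar:
    """Hypothetical per-character record; field names are illustrative."""

    char: str
    left: int
    top: int
    width: int
    height: int
    confidence: int  # assumed scale: 0-100, higher is more certain
    bold: bool = False


def low_confidence(chars, threshold=80):
    """Return the characters an application might flag for manual review."""
    return [c for c in chars if c.confidence < threshold]


def to_lines(chars, line_tolerance=5):
    """Group characters into text lines by vertical position."""
    lines = []
    for c in sorted(chars, key=lambda c: (c.top, c.left)):
        # Same line if the top edge is close to the line's first character.
        if lines and abs(lines[-1][0].top - c.top) <= line_tolerance:
            lines[-1].append(c)
        else:
            lines.append([c])
    # Within each line, order characters left to right.
    return [
        "".join(c.char for c in sorted(line, key=lambda c: c.left))
        for line in lines
    ]
```

The same records could drive either extreme mentioned above: use the coordinates to mirror the original layout, or ignore them and apply a style of the application's own.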
LEADTOOLS OCR supports the output of many different file formats, including:
Adobe PDF * | When displayed in a PDF reader, the generated file closely resembles the original document. The text is searchable, and the recognized characters appear in the same positions as in the original. |
Text - Standard | Text output with a line break after each line. If a table is present, its cells are positioned by TABs. |
Text - Smart | Text output with a line break after each line. The left margin is taken into account (with SPACEs). If a table is present, its cells are positioned by SPACEs. |
Text - Stripped | Text output with a line break after each paragraph. If a table is present, its cells are separated by TABs. |
Text - Plain | Text output with a line break after each line. The left and upper margins are taken into account (with SPACEs and NEWLINEs). If a table is present, its cells are positioned by TABs. |
Text - Comma Delimited | Comma delimited text output. Line/cell contents are surrounded by quotes (""). The default delimiter (comma) can be overridden. |
Text - Tab Delimited | TAB-separated text output. Line/cell contents are surrounded by quotes (""). |
Rec ASCII (Formatted) | Text output with layout retention, mimicked with SPACEs. Line/cell contents are surrounded by quotes (""). |
Rec ASCII (Standard) | Text output allowing quick text conversion. |
Rec ASCII (StandardEx) | Text output allowing quick text conversion. Line break after each line and after each zone. |
General Word Processor | Text output allowing quick text conversion. Line break after each paragraph. |
HTML 3.2 | HTML output. HTML 3.2 is useful for exporting with partial formatting. The output files can be viewed in both IE and Netscape. |
HTML 4.0 | HTML output. HTML 4.0 can set the exact position and size of objects; use this output format for full formatting. |
Word 97, 2000, XP | Microsoft Word 97, Word 2000 and Word XP output format. |
Excel 97, 2000 | Microsoft Excel 97 and Excel 2000 output format. |
WordPerfect 8 | WordPerfect 8 format. |
Rich Text Format | Quick conversion to Rich Text Format. |
PDF/A | Image on text. |
PowerPoint 97 (RTF) | Rich Text Format for PowerPoint 97. |
Publisher 98 (RTF) | Rich Text Format for Publisher 98. |
WordPad (RTF) | Rich Text Format for WordPad. |
RTF Word 2000 | Rich Text Format for Word 2000. |
RTF Word 97 | Rich Text Format for Word 97. |
RTF Word 6.0/95 | Rich Text Format for Word 6.0/95. |
Open eBook 1.0 | Open eBook 1.0 format. |
XML | XML output format. |
XPS | XML Paper Specification format. |
2G Type 2 | Binary output of the recognition with a 16-byte long structure for each recognized character. 2G Type 2 structure output. |
2G Type 3 | Binary output of the recognition with a 16-byte long structure for each recognized character. 2G Type 3 structure output. |
* LEADTOOLS PDF OCR Module is required to output PDF.
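As an illustration of the delimited text formats listed above, the following sketch writes rows with every cell surrounded by quotes and with a delimiter that can be overridden. It uses Python's standard `csv` module rather than LEADTOOLS, and `to_delimited` is a hypothetical helper; details such as how embedded quotes are escaped may differ from the actual LEADTOOLS output.

```python
import csv
import io


def to_delimited(rows, delimiter=","):
    """Serialize rows of cells as quoted, delimiter-separated text."""
    buffer = io.StringIO()
    writer = csv.writer(
        buffer,
        delimiter=delimiter,
        quoting=csv.QUOTE_ALL,  # surround every cell with quotes
        lineterminator="\n",
    )
    writer.writerows(rows)
    return buffer.getvalue()
```

Calling `to_delimited(rows)` gives comma-delimited text, while `to_delimited(rows, delimiter="\t")` gives the tab-delimited variant. Quoting every cell keeps a value such as `Widget, large` intact even though it contains the delimiter.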
Supported Platforms