The LEADTOOLS OCR Engine can deliver precise coordinate, confidence, and attribute data for each recognized character. This gives the application fine control over the formatting of the output text: at one extreme it can mirror the input document, and at the other it can apply a unique user-defined style.
LEADTOOLS OCR supports output to many different file formats, including:
Adobe PDF * |
When displayed in a PDF reader, the generated PDF file looks very similar to the original document. The text can be searched, and the recognized characters are placed in the same positions as in the original. |
Text - Standard |
Text output with a line break after each line. If a table is present, its cells are positioned by TABs. |
Text - Smart |
Text output with a line break after each line. The left margin is taken into account (with SPACEs). If a table is present, its cells are positioned by SPACEs. |
Text - Stripped |
Text output with a line break after each paragraph. If a table is present, its cells are separated by TABs. |
Text - Plain |
Text output with a line break after each line. The left and upper margins are taken into account (with SPACEs and NEWLINEs). If a table is present, its cells are positioned by TABs. |
Text - Comma Delimited |
Comma-delimited text output. Line/cell contents are surrounded by quotes (""). The default delimiter (comma) can be overridden. |
Text - Tab Delimited |
TAB-separated text output. Line/cell contents are surrounded by quotes (""). |
Rec ASCII (Formatted) |
Text output that retains the layout by mimicking it with SPACEs. Line/cell contents are surrounded by quotes (""). |
Rec ASCII (Standard) |
Text output allowing quick text conversion. |
Rec ASCII (StandardEx) |
Text output allowing quick text conversion. Line break after each line and after each zone. |
General Word Processor |
Text output allowing quick text conversion. Line break after each paragraph. |
HTML 3.2 |
HTML output. HTML 3.2 is useful for exporting with partial formatting. The output files work in both IE and Netscape. |
HTML 4.0 |
HTML output. HTML 4.0 can set the exact position and size of objects; use this output format for full formatting. |
Word 97, 2000, XP |
Microsoft Word 97, Word 2000 and Word XP output format. |
Excel 97, 2000 |
Microsoft Excel 97 and Excel 2000 output format. |
WordPerfect 8 |
WordPerfect 8 format. |
Rich Text Format |
Quick conversion to Rich Text Format. |
PowerPoint 97 (RTF) |
Rich Text Format for PowerPoint 97. |
Publisher 98 (RTF) |
Rich Text Format for Publisher 98. |
WordPad (RTF) |
Rich Text Format for WordPad. |
RTF Word 2000 |
Rich Text Format for Word 2000. |
RTF Word 97 |
Rich Text Format for Word 97. |
RTF Word 6.0/95 |
Rich Text Format for Word 6.0/95. |
Open eBook 1.0 |
Open eBook 1.0 format. |
XML |
XML output format conforming to ScanSoft's schema file SSDOC-SCHEMA2.xml (http://www.scansoft.com/omnipage/xml/SSDOC-SCHEMA2.xml). |
2G Type 2 |
Binary output of the recognition results, with a 16-byte structure for each recognized character (2G Type 2 structure output). |
2G Type 3 |
Binary output of the recognition results, with a 16-byte structure for each recognized character (2G Type 3 structure output). |
* The LEADTOOLS PDF OCR Plug-in is required to output PDF.
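
As a rough illustration of the two delimited text formats in the table above, the following Python sketch shows how an application might read such output back in. It assumes only what the table states: cells are surrounded by double quotes and separated by a comma, a TAB, or an overridden delimiter. The `read_delimited` helper and the sample strings are hypothetical and hand-written for illustration; they are not part of LEADTOOLS and were not produced by it.

```python
import csv
import io


def read_delimited(text, delimiter=","):
    """Parse quoted, delimited text into a list of rows (lists of cell strings).

    Hypothetical helper for illustration only; not a LEADTOOLS API.
    """
    # csv.reader removes the surrounding quotes and keeps any delimiter
    # that appears inside a quoted cell as part of that cell.
    reader = csv.reader(io.StringIO(text), delimiter=delimiter, quotechar='"')
    return [row for row in reader]


# Hand-written samples standing in for OCR output (assumed shape).
comma_sample = '"Item","Price"\n"Paper, A4","4.50"\n'
tab_sample = '"Item"\t"Price"\n"Paper, A4"\t"4.50"\n'

comma_rows = read_delimited(comma_sample)
tab_rows = read_delimited(tab_sample, delimiter="\t")

for row in comma_rows:
    print(row)
```

Because the cell contents are quoted, a comma inside a cell (such as `Paper, A4` in the sample) does not split the cell, which is why a quote-aware parser is preferable to a plain string split on the delimiter.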
More: