OCR
LEADTOOLS' OCR engine includes preset confidence and accuracy levels, artificial intelligence, and built-in and user-defined lexicons for limiting the type of text to recognize within a particular zone. LEADTOOLS lets you verify or correct text during recognition and, using LEAD's unique OCR editor, after recognition. LEAD's OCR editor ties the text being edited directly to the image, providing a visual reference to the original bitmap data. The OCR engine can perform automatic area segmentation, creating multi-layered zones and recognizing areas such as tables, rules, images, and text. Alternatively, you can manually designate up to 250 such zones.
Different fonts, sizes (5 to 72 points), and styles are also supported. Fax, dot matrix, and halftone input can be preprocessed to improve recognition results. The OCR engine supports major European and Scandinavian languages (Danish, Dutch, Finnish, French, German, Italian, Norwegian, Portuguese, Russian, Spanish, and Swedish) as well as English. Support for dialects such as U.S. English, French Canadian, Latin American Spanish, Swiss German, and Brazilian Portuguese is also provided. Recognized text can be exported to more than 40 different formats, including MS Word, MS Excel, dBASE, and WordPerfect. You get superior OCR processing speeds for use in form recognition and processing applications. The OCR features extend the functionality of the LEADTOOLS Document Imaging Suite by providing properties, methods, and events for easily incorporating the Xerox TextBridge optical character recognition engine into your applications.
These functions, properties, and methods let you:
[Document Imaging Suite ActiveX32 only] Recognize and export text, choosing from a variety of text, word processing, database, or spreadsheet file formats.
[Document Imaging Suite ActiveX32 only] Select the language of documents to be recognized. Choose from English, Danish, Dutch, Finnish, French, German, Italian, Norwegian, Portuguese, Russian, Spanish, or Swedish.
[Document Imaging Suite ActiveX32 only] Segment complex pages manually or automatically into text zones, image zones, table zones, lines, headers and footers.
[Document Imaging Suite ActiveX32 only] Set accuracy thresholds before recognition to control how accurate the results must be.
[Document Imaging Suite ActiveX32 only] Learn, save and load character recognition data for similar documents. The software learns as a result of normal recognition, and acquires additional information by using the OCR's text verification system.
[Document Imaging Suite ActiveX32 only] Recognize text from 5 to 72 points in virtually any typeface.
[Document Imaging Suite ActiveX32 only] Increase recognition accuracy with built-in lexical classes and user-defined lexicons.
[Document Imaging Suite ActiveX32 only] Verify or correct text during the recognition process based on confidence levels set prior to recognition. If a word or character falls within the set range, a dialog box can be brought up to allow the user to see the original image and the preliminary results of the recognition. From the dialog box, the user may make any necessary corrections to the recognized text.
[Document Imaging Suite ActiveX32 only] Once recognition is complete, edit the entire document using the LEADTOOLS OCR Editor.
[Document Imaging Suite ActiveX32 only] Automatically detect fax, dot matrix, and other degraded documents and compensate accordingly.
[Document Imaging Suite ActiveX32 only] Process both text and halftone graphics. The recognition software's ability to distinguish halftone graphics from text can provide the basis of a compound document processing system.
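The confidence-based verification described in the feature list (a word or character whose confidence falls within a range set before recognition is shown to the user for correction) can be illustrated with a minimal sketch. This is not the LEADTOOLS API: the `RecognizedWord` type, the 0 to 100 confidence scale, and the functions `words_to_verify` and `apply_corrections` are all hypothetical names chosen for illustration, and the sketch only models the selection logic, not the dialog box or the engine itself.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class RecognizedWord:
    """Hypothetical container for one recognized word (not a LEADTOOLS type)."""
    text: str
    confidence: int  # assumed scale: 0 (no confidence) to 100 (certain)


def words_to_verify(words: List[RecognizedWord],
                    verify_range: Tuple[int, int]) -> List[int]:
    """Return the indexes of words whose confidence lies inside
    verify_range (inclusive on both ends)."""
    low, high = verify_range
    return [i for i, w in enumerate(words) if low <= w.confidence <= high]


def apply_corrections(words: List[RecognizedWord],
                      corrections: Dict[int, str]) -> List[RecognizedWord]:
    """Return a new list in which each corrected index carries the
    user-supplied text; the input list is left unchanged."""
    return [RecognizedWord(corrections.get(i, w.text), w.confidence)
            for i, w in enumerate(words)]


if __name__ == "__main__":
    # Made-up recognition results for illustration only.
    page = [RecognizedWord("Invoice", 97),
            RecognizedWord("t0tal", 42),
            RecognizedWord("due", 88)]
    flagged = words_to_verify(page, (0, 60))   # range chosen before recognition
    fixed = apply_corrections(page, {i: "total" for i in flagged})
    print(flagged, [w.text for w in fixed])
```

In a real application the flagged indexes would drive the verification dialog, which shows the original image region next to the preliminary text; here the correction is supplied directly to keep the sketch self-contained.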