LEAD Technologies, Inc

Recognizing OCR Pages

Each zone on a page has a recognition module associated with it through the OcrZone.RecognitionModule property. The recognition module describes the type of data contained in the zone and how that data is recognized. Depending on the type of recognition module, additional options may be available for use during recognition. These options are usually engine-specific. For a list of engine-specific features and options, refer to OCR Engine Specific Settings.

For some general information about available recognition modules, refer to An Overview of OCR Recognition Modules.

You can modify the following properties before starting the recognition process:

The Leadtools.Forms.Ocr.IOcrSpellCheckManager.SpellCheckEngine property enables or disables the spell-checking subsystem, which is used to verify the recognition results.

When all necessary recognition options have been set, the page(s) can be recognized by calling IOcrPage.Recognize.

After recognition is complete, the recognized characters can be obtained and the recognition results can be saved to a file, a .NET stream or to memory.

The collection of characters recognized for a specific page can be obtained using IOcrPage.GetRecognizedCharacters. You can inspect this collection, modify it, and then write the changes back to the recognition data with the IOcrPage.SetRecognizedCharacters method.
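The read-modify-write cycle described above can be sketched as follows. This is a minimal illustration, not a definitive implementation: the class and method names (RecognizedCharactersSketch, UppercaseAll) are hypothetical, and the sketch assumes that the page has already been recognized, that each zone's character collection can be indexed, and that OcrCharacter exposes a Code property holding the character value.

```csharp
using System;
using Leadtools.Forms.Ocr;

static class RecognizedCharactersSketch
{
    // Hypothetical helper: converts every recognized character on the page to upper case.
    // Assumes ocrPage.Recognize has already been called on this page.
    public static void UppercaseAll(IOcrPage ocrPage)
    {
        // Get the recognized characters, grouped by zone
        IOcrPageCharacters pageCharacters = ocrPage.GetRecognizedCharacters();

        foreach (IOcrZoneCharacters zoneCharacters in pageCharacters)
        {
            for (int i = 0; i < zoneCharacters.Count; i++)
            {
                // OcrCharacter is assumed to be a value type, so work on a copy
                // and store it back into the collection
                OcrCharacter character = zoneCharacters[i];
                character.Code = Char.ToUpper(character.Code);
                zoneCharacters[i] = character;
            }
        }

        // Update the page's recognition data with the modified collection
        ocrPage.SetRecognizedCharacters(pageCharacters);
    }
}
```

Results saved after this call should reflect the modified characters.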

The recognition results can be saved to a file or a .NET stream by calling IOcrDocument.Save. This method takes a parameter that specifies the document format to save (PDF, DOC, TXT, etc.). LEADTOOLS .NET OCR uses the Leadtools.Forms.DocumentWriters assembly to save the OCR results to an output file.
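As a rough sketch of the overall flow (start an engine, add a page, recognize it, save the results), something like the following should work. The file paths and the class name are hypothetical, the choice of OcrEngineType.Advantage is an assumption (use whichever engine type is installed), and passing null to Startup assumes the default codecs, document writer, work directory and engine location are acceptable. Any toolkit license or unlock step your installation requires is omitted here.

```csharp
using Leadtools.Forms.DocumentWriters;
using Leadtools.Forms.Ocr;

class RecognizeAndSaveSketch
{
    static void Main()
    {
        // Hypothetical input and output paths
        string inputFile = @"C:\Images\Ocr1.tif";
        string outputFile = @"C:\Images\Ocr1.pdf";

        // Assumption: the Advantage engine is installed; substitute your engine type
        using (IOcrEngine ocrEngine = OcrEngineManager.CreateEngine(OcrEngineType.Advantage, false))
        {
            // null arguments request the engine defaults
            ocrEngine.Startup(null, null, null, null);

            using (IOcrDocument ocrDocument = ocrEngine.DocumentManager.CreateDocument())
            {
                // Add the image as a page and let the engine find the zones
                IOcrPage ocrPage = ocrDocument.Pages.AddPage(inputFile, null);
                ocrPage.AutoZone(null);

                // Recognize the page (no progress callback)
                ocrPage.Recognize(null);

                // Save the results as PDF using the default document options
                ocrDocument.Save(outputFile, DocumentFormat.Pdf, null);
            }

            ocrEngine.Shutdown();
        }
    }
}
```

Changing the DocumentFormat argument selects a different output format supported by the document writers.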

In addition to the various formats supported by the document writers, the recognition results can also be saved as XML using IOcrDocument.SaveXml.

The recognition results can also be obtained directly as a simple .NET string object by calling the IOcrPage.RecognizeText method.
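A minimal sketch of the string-based route is shown below. The helper name (GetPageText) is hypothetical, and the code assumes the page already belongs to a started engine's document, that its zones have been defined or auto-detected, and that RecognizeText accepts an optional progress callback (null here).

```csharp
using System;
using Leadtools.Forms.Ocr;

static class RecognizeTextSketch
{
    // Hypothetical helper: returns the page text as a plain string
    // without saving a document to disk.
    public static string GetPageText(IOcrPage ocrPage)
    {
        // null means no progress callback is used
        string text = ocrPage.RecognizeText(null);
        return text;
    }

    // Example use: print the recognized text to the console
    public static void PrintPageText(IOcrPage ocrPage)
    {
        Console.WriteLine(GetPageText(ocrPage));
    }
}
```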

Finally, to get or set special characters used in the recognition process, use the IOcrDocumentManager.RejectionSymbol and the related IOcrDocumentManager.MissingSymbol properties.


© 2006-2012 All Rights Reserved. LEAD Technologies, Inc.