Recognizing OCR Pages

You can modify the following properties before starting the recognition process:

Leadtools.Ocr.IOcrSpellCheckManager.SpellCheckEngine

The Leadtools.Ocr.IOcrSpellCheckManager.SpellCheckEngine property enables or disables the spell-checking subsystem, which is used to verify the recognition results.

When all necessary recognition options have been set, the page(s) can be recognized by calling IOcrPage.Recognize.
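The steps above can be sketched as follows. This is a minimal illustration rather than a definitive implementation: it assumes the LEAD engine type, a valid LEADTOOLS license already set elsewhere in the application, default runtime locations (the `null` arguments to `Startup`), and a hypothetical `imagePath` parameter. The class and method names are made up for the example, and turning spell checking off is only one possible setting.

```csharp
using Leadtools.Ocr;

public static class RecognizeSketch
{
    // imagePath is a hypothetical input image supplied by the caller.
    public static void RecognizeFirstPage(string imagePath)
    {
        // Assumption: the LEAD engine is installed and a license has already been set.
        using (IOcrEngine ocrEngine = OcrEngineManager.CreateEngine(OcrEngineType.LEAD))
        {
            // Start the engine with default settings and runtime locations.
            ocrEngine.Startup(null, null, null, null);

            // Set recognition options before recognizing; here spell checking is disabled.
            ocrEngine.SpellCheckManager.SpellCheckEngine = OcrSpellCheckEngine.None;

            using (IOcrDocument ocrDocument = ocrEngine.DocumentManager.CreateDocument())
            {
                // Add the image as a page, then recognize it (no progress callback).
                IOcrPage ocrPage = ocrDocument.Pages.AddPage(imagePath, null);
                ocrPage.Recognize(null);
            }

            ocrEngine.Shutdown();
        }
    }
}
```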

After recognition is complete, the recognized characters can be obtained, and the recognition results can be saved to a file, a .NET stream, or memory.

The collection of characters recognized for a specific page can be obtained using IOcrPage.GetRecognizedCharacters. You can inspect this collection, modify it, and then update the recognition data with the IOcrPage.SetRecognizedCharacters method.

The recognition results can be saved to a file or a .NET stream by calling IOcrDocument.Save. This method takes a DocumentFormat parameter that specifies the output document format (PDF, DOC, TXT, etc.). LEADTOOLS .NET OCR uses the Leadtools.Document.Writer assembly to save the OCR results to an output file.

In addition to the various formats supported by the document writers, the recognition results can also be saved as XML using IOcrDocument.SaveXml.

The recognition results can also be obtained directly as a simple .NET string object by calling the IOcrPage.GetText method.
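As a rough sketch of the two output paths described above, the helper below saves a document as PDF and returns the text of one page. It assumes the caller already holds an `IOcrDocument` and an `IOcrPage` that has been recognized. The class name, method name and `pdfPath` parameter are hypothetical. Passing `null` for the save options and `-1` to `GetText` (to request the text of all zones) reflects common usage and should be checked against the reference for your LEADTOOLS version.

```csharp
using Leadtools.Ocr;
using Leadtools.Document.Writer;

public static class SaveResultsSketch
{
    // ocrDocument and ocrPage are assumed to come from an engine that is already
    // started, with ocrPage recognized. pdfPath is a hypothetical output file name.
    public static string SaveAndGetText(IOcrDocument ocrDocument, IOcrPage ocrPage, string pdfPath)
    {
        // Save the recognition results through the document writer as PDF,
        // using the default save options (null).
        ocrDocument.Save(pdfPath, DocumentFormat.Pdf, null);

        // Return the recognized text of the page as a plain .NET string;
        // -1 is assumed here to mean "all zones".
        return ocrPage.GetText(-1);
    }
}
```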

Finally, to get or set special characters used in the recognition process, use the IOcrDocumentManager.RejectionSymbol and the related IOcrDocumentManager.MissingSymbol properties.