IOcrDocument Interface

Summary

Defines an OCR document object.

Syntax

C#

public interface IOcrDocument : IDisposable

Visual Basic

Public Interface IOcrDocument
   Inherits System.IDisposable

Objective-C

@interface LTOcrDocument : NSObject

Java

public class OcrDocument

C++

public interface class IOcrDocument : public System.IDisposable

Remarks

The IOcrDocument object holds the recognition data for one or more pages and is used to convert this data to the final output document.

For information on how to create memory-based or file-based documents, or how to load file-based documents from disk, refer to IOcrDocumentManager.CreateDocument and Programming with the LEADTOOLS .NET OCR.

A typical OCR operation using IOcrEngine involves starting up the engine, creating an IOcrDocument object with the CreateDocument method, adding pages to it, and then performing either automatic or manual zoning. Once this is done, call the IOcrPage.Recognize method on each page to collect the recognition data and store it internally in the page. After the recognition data is collected, use the various IOcrDocument.Save methods to save the document to its final format. You can also use the various IOcrDocument.SaveXml methods to save the document as XML. For more information, refer to OcrXmlOutputOptions.
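The sequence described above can be sketched in C# roughly as follows. This is a minimal illustration rather than the reference example. It assumes a recent release of the toolkit where the OCR types live in the Leadtools.Ocr namespace, DocumentFormat lives in Leadtools.Document.Writer, and the engine type is named OcrEngineType.LEAD. It also assumes the LEADTOOLS license has already been set and that passing null for every Startup argument is acceptable in your setup. The names OcrWorkflowSketch, inputFile and outputFile are hypothetical.

```csharp
using Leadtools.Ocr;              // assumption: older releases use Leadtools.Forms.Ocr
using Leadtools.Document.Writer;  // assumption: namespace that defines DocumentFormat

static class OcrWorkflowSketch
{
    // inputFile and outputFile are hypothetical paths supplied by the caller.
    public static void Run(string inputFile, string outputFile)
    {
        // Assumption: OcrEngineType.LEAD is the engine type name in your release.
        using (IOcrEngine engine = OcrEngineManager.CreateEngine(OcrEngineType.LEAD))
        {
            // Start the engine with default codecs, writer, work directory and parameters.
            engine.Startup(null, null, null, null);
            try
            {
                // Dispose the document as soon as it is no longer needed.
                using (IOcrDocument document = engine.DocumentManager.CreateDocument())
                {
                    // Add one page, zone it automatically, then recognize it.
                    IOcrPage page = document.Pages.AddPage(inputFile, null);
                    page.AutoZone(null);
                    page.Recognize(null);

                    // Convert the collected recognition data to the final format.
                    document.Save(outputFile, DocumentFormat.Pdf, null);
                }
            }
            finally
            {
                engine.Shutdown();
            }
        }
    }
}
```

The null arguments to AddPage, AutoZone, Recognize and Save stand in for the optional progress callback.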

Use IOcrDocument.Save as many times as required to save the document to multiple formats such as PDF, DOC and HTML (as well as XML through the IOcrDocument.SaveXml method). You can also continue to add and recognize pages (through the IOcrPage.Recognize method) after you save the document.

For each IOcrPage that has not been recognized (Recognize was never called on it, so its IOcrPage.IsRecognized value is still false), the IOcrDocument inserts a raster-only page into the final document.

To get the low-level recognition data, including the recognized characters and their confidence values, use IOcrPage.GetRecognizedCharacters instead.
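As a hedged sketch of reading that low-level data, the following C# method walks the characters of a recognized page. It assumes that GetRecognizedCharacters returns a collection of per-zone character lists, and that each OcrCharacter exposes Code and Confidence members, so check both against your version of the toolkit. The class name RecognizedCharactersSketch is hypothetical.

```csharp
using System;
using Leadtools.Ocr;  // assumption: older releases use Leadtools.Forms.Ocr

static class RecognizedCharactersSketch
{
    // Prints every recognized character of the page with its confidence.
    public static void Dump(IOcrPage page)
    {
        // A page that was never recognized has no character data to report.
        if (!page.IsRecognized)
            return;

        IOcrPageCharacters pageCharacters = page.GetRecognizedCharacters();

        // Assumption: one IOcrZoneCharacters entry per zone on the page.
        foreach (IOcrZoneCharacters zoneCharacters in pageCharacters)
        {
            foreach (OcrCharacter character in zoneCharacters)
            {
                Console.WriteLine("'{0}' confidence {1}", character.Code, character.Confidence);
            }
        }
    }
}
```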

The IOcrDocument interface implements IDisposable, so you must dispose the IOcrDocument object as soon as you are finished using it. Disposing an IOcrDocument object frees all the pages stored inside its IOcrDocument.Pages collection.

Some OCR engine types support multi-threaded operation: you create one IOcrEngine and multiple IOcrDocument or IOcrAutoRecognizeJob objects, each in its own dedicated thread. For more information, refer to Multi-Threading with LEADTOOLS OCR.

IOcrDocument.IsInMemory will be true for memory-based documents and false for file-based documents.

IOcrDocument.FileName can be used to obtain the name of the disk file used by a file-based document.
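For illustration, the two properties can be combined to describe how a document is stored. This is a minimal sketch that assumes IsInMemory is a Boolean and FileName is a string. The class DocumentStorageSketch and its Describe method are hypothetical helpers, not part of the toolkit.

```csharp
using Leadtools.Ocr;  // assumption: older releases use Leadtools.Forms.Ocr

static class DocumentStorageSketch
{
    // Returns a short description of where the document keeps its data.
    public static string Describe(IOcrDocument document)
    {
        if (document.IsInMemory)
            return "memory-based document";

        // Only file-based documents have a backing disk file.
        return "file-based document stored in " + document.FileName;
    }
}
```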

Example

For an example, refer to IOcrDocumentManager and IOcrEngine.

Requirements

Target Platforms