public IOcrDocument CreateDocument(
string documentFileName,
OcrCreateDocumentOptions options
)
documentFileName
The document file name. This value can be null.
options
Options to control how the document is created or loaded.
An object implementing IOcrDocument that can participate in recognition and saving operations.
This method can create a file-based or memory-based OCR document, or load a previously created file-based document, depending on the values of documentFileName and options as follows:
To create a memory-based document, pass OcrCreateDocumentOptions.InMemory to options. documentFileName is not used and the engine will not use a disk file to store the document data.
To create a file-based document that will not be re-used, pass null to documentFileName and OcrCreateDocumentOptions.AutoDeleteFile to options. In this case, the engine will create a temporary file on disk to use as the store for the document data. The file is deleted when the IOcrDocument is disposed. Note that if you use your own file name in documentFileName along with OcrCreateDocumentOptions.AutoDeleteFile, the engine will overwrite this file if it exists and automatically delete it when the document is disposed.
To create a file-based document that will be re-used, pass a file name to documentFileName and OcrCreateDocumentOptions.None to options. In this case, the engine will overwrite this file if it exists but will not delete it when the IOcrDocument is disposed.
To re-load a document that was created with the previous option, pass the same file name to documentFileName and OcrCreateDocumentOptions.LoadExisting to options. In this case, the engine will re-generate the document from data found in the file.
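The four combinations above can be sketched as follows. This is a minimal illustration, not a definitive implementation: the class name, the method name, and the reusableFileName parameter are hypothetical, and the engine is assumed to have been started already. Only the CreateDocument overload and the OcrCreateDocumentOptions values described in this topic are used.

```csharp
using Leadtools.Ocr;

public static class CreateDocumentModesSketch
{
   // Hypothetical helper. Assumes ocrEngine has already been started and that
   // reusableFileName is a writable path chosen by the caller.
   public static void Run(IOcrEngine ocrEngine, string reusableFileName)
   {
      // 1. Memory-based document: documentFileName is not used, so null is passed
      using (IOcrDocument memoryDocument = ocrEngine.DocumentManager.CreateDocument(null, OcrCreateDocumentOptions.InMemory))
      {
         // Add pages, recognize and save here
      }

      // 2. File-based document that will not be re-used: the engine creates a
      //    temporary file and deletes it when the document is disposed
      using (IOcrDocument temporaryDocument = ocrEngine.DocumentManager.CreateDocument(null, OcrCreateDocumentOptions.AutoDeleteFile))
      {
         // Add pages, recognize and save here
      }

      // 3. File-based document that will be re-used: an existing file with this
      //    name is overwritten, and the file is kept after the document is disposed
      using (IOcrDocument reusableDocument = ocrEngine.DocumentManager.CreateDocument(reusableFileName, OcrCreateDocumentOptions.None))
      {
         // Add pages here so the file has content to re-load later
      }

      // 4. Re-load the document created in step 3 from the same file
      using (IOcrDocument reloadedDocument = ocrEngine.DocumentManager.CreateDocument(reusableFileName, OcrCreateDocumentOptions.LoadExisting))
      {
         // Continue working with the re-generated document here
      }
   }
}
```

Each document is wrapped in a using block so that it is disposed as soon as it is no longer needed. Whether an empty document re-loads cleanly in step 4 is not stated in this topic, so add pages in step 3 before relying on it.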
Use IOcrDocument.IsInMemory to test whether a document is memory-based or file-based, and IOcrDocument.FileName to get the name of the disk file used by a file-based document. FileName will be set to the value passed to documentFileName or, if the engine created a temporary file, to the name of that file.
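As a small sketch of the two properties just described, the hypothetical helper below reports where a document stores its data. The class and method names are illustrative only; IsInMemory and FileName are the members named in this topic.

```csharp
using Leadtools.Ocr;

public static class DocumentStorageSketch
{
   // Hypothetical helper: returns a short description of the document's storage
   public static string Describe(IOcrDocument ocrDocument)
   {
      if (ocrDocument.IsInMemory)
      {
         // Memory-based documents do not use a disk file
         return "Memory-based document";
      }

      // File-based: FileName is either the name passed to CreateDocument
      // or the name of the temporary file the engine created
      return "File-based document stored in " + ocrDocument.FileName;
   }
}
```

For example, calling Describe on a document created with OcrCreateDocumentOptions.AutoDeleteFile and a null file name would be expected to report the temporary file chosen by the engine.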
For more information on memory and file-based documents, refer to Programming with the LEADTOOLS .NET OCR.
A typical OCR operation using IOcrEngine involves starting up the engine, creating an OCR document with the CreateDocument method, adding pages to it, and performing either automatic or manual zoning. Once this is done, IOcrPage.Recognize is called on each page to collect the recognition data and store it internally in the page. After the recognition data is collected, use the various IOcrDocument.Save or IOcrDocument.SaveXml methods to save the document in its final format.
When you are done using the IOcrDocument object created by this method, dispose of it as soon as possible to free its resources. Disposing an IOcrDocument object will free all the pages stored inside its IOcrDocument.Pages collection.
using System;
using System.Diagnostics;
using System.IO;
using Leadtools;
using Leadtools.Codecs;
using Leadtools.Ocr;
using Leadtools.Document.Writer;
public void StartupEngineExample()
{
// Use RasterCodecs to load an image file
// Note: You can let the engine load the image file directly as shown in the other examples
RasterCodecs codecs = new RasterCodecs();
RasterImage image = codecs.Load(Path.Combine(LEAD_VARS.ImagesDir, "Ocr1.tif"));
// Assume you copied the engine runtime files to C:\MyApp\Ocr
string engineDir = @"C:\MyApp\Ocr";
// Store the engine work directory into a path inside our application
string workDir = @"C:\MyApp\OcrTemp";
// Delete all files in the work directory in case a previous run of our application exited abnormally and
// the engine did not get the chance to clean up all of its temporary files (if any).
// Directory.Delete throws if the directory does not exist, so check first
if (Directory.Exists(workDir))
Directory.Delete(workDir, true);
// Re-create the work directory
Directory.CreateDirectory(workDir);
// Create an instance of the engine
using (IOcrEngine ocrEngine = OcrEngineManager.CreateEngine(OcrEngineType.LEAD))
{
// Show that the engine has not been started yet
Console.WriteLine("Before calling Startup, IsStarted = " + ocrEngine.IsStarted);
// Start the engine using our parameters
// Since we already have a RasterCodecs object, we can re-use it to save memory and resources
ocrEngine.Startup(codecs, null, workDir, engineDir);
// Make sure the engine is using our working directory
Console.WriteLine("workDir passed is {0}, the value of WorkDirectory after Startup is {1}", workDir, ocrEngine.WorkDirectory);
// Show that the engine has started successfully
Console.WriteLine("After calling Startup, EngineType is {0}, IsStarted = {1}", ocrEngine.EngineType, ocrEngine.IsStarted);
// Make sure the engine is using our own instance of RasterCodecs
Debug.Assert(codecs == ocrEngine.RasterCodecsInstance);
// Create an OCR page from the raster image, to be added to the document later
IOcrPage ocrPage = ocrEngine.CreatePage(image, OcrImageSharingMode.AutoDispose);
// The image now belongs to the page and will be disposed when the page is disposed
// Recognize the page
// Note: Recognize can be called without calling AutoZone or manually adding zones. The engine will
// check for zones and auto-zone the page if there are none
ocrPage.Recognize(null);
// Create a file-based document backed by a temporary file that is deleted when the document is disposed
using (IOcrDocument ocrDocument = ocrEngine.DocumentManager.CreateDocument(null, OcrCreateDocumentOptions.AutoDeleteFile))
{
// Add the page
ocrDocument.Pages.Add(ocrPage);
// The page is no longer needed, so dispose it
ocrPage.Dispose();
// Save the document we have as PDF
string pdfFileName = Path.Combine(LEAD_VARS.ImagesDir, "Ocr1.pdf");
ocrDocument.Save(pdfFileName, DocumentFormat.Pdf, null);
}
// Shutdown the engine
// Note: calling Dispose will also automatically shutdown the engine if it has been started
ocrEngine.Shutdown();
}
}
static class LEAD_VARS
{
public const string ImagesDir = @"C:\LEADTOOLS23\Resources\Images";
}