Multi-Threading with LEADTOOLS OCR

The .NET OCR class library (Leadtools.Forms.Ocr) provides a common gateway to the various OCR runtime engines available with LEADTOOLS. Use the OcrEngineManager.CreateEngine method with the runtime engine type required to obtain an instance of the IOcrEngine:


             // Create an instance of the OCR engine of a given type
             IOcrEngine ocrEngineInstance = OcrEngineManager.CreateEngine(ocrEngineType, useThunkServer);
             // Start up the engine and use it ...

From this instance, you can then obtain one or more instances of IOcrDocument (or IOcrAutoRecognizeManager, which creates IOcrDocument instances internally) to perform all the OCR operations needed: loading source images such as TIF and raster PDF, zoning, recognition, and export to a final document format such as PDF, DOC, DOCX (2007/2010), HTML or TXT. To do so, use code similar to one of the following:


             // Create an instance of an OCR document from the engine
             IOcrDocument ocrDocument = ocrEngineInstance.DocumentManager.CreateDocument();
             // Add pages, zone them, recognize them and save them
             // to the final document:
             ocrDocument.Pages.AddPages(imageFileName, null);
             ocrDocument.Recognize(null);
             ocrDocument.Save(documentFileName, DocumentFormat.Pdf, null);


             // Use IOcrAutoRecognizeManager to automatically recognize a document
             IOcrAutoRecognizeJob ocrJob = ocrEngineInstance.AutoRecognizeManager.CreateJob(
                new OcrAutoRecognizeJobData(imageFileName, DocumentFormat.Pdf, documentFileName));
             ocrEngineInstance.AutoRecognizeManager.RunJob(ocrJob);

All of these operations are performed in an engine runtime-independent manner by making calls to IOcrEngine, IOcrAutoRecognizeManager, IOcrDocument and other interfaces which interact internally with the engine runtime to perform the action required.

LEADTOOLS supports the following engine runtimes:

Professional
Advantage
Arabic - Specific for Arabic language OCR

Contact LEADTOOLS support at https://www.leadtools.com for more information.

Depending on the application requirements, platform and OCR engine runtime type, a "Thunk" mechanism may be required when using the LEADTOOLS OCR modules (it is not required when using the Advantage or Professional engines). The LEADTOOLS OCR Thunk Server is a COM+ object that hosts an instance of the internal OCR engine runtime in a separate process. It addresses the following limitations:

Some OCR engines only have 32-bit (x86) runtimes that normally cannot be used by a 64-bit (x64) application. The Thunk Server internally marshals the calls between the 64-bit application and the 32-bit runtime seamlessly.
Some OCR engines are not thread-safe. Therefore, you cannot create more than one instance of an IOcrEngine in an application at the same time and in the same process. Using the Thunk Server removes this restriction since each instance of the OCR runtime is stored in a separate process. Note that you can create an instance of an IOcrEngine, dispose of it, and then create another instance without restriction.

The useThunkServer parameter of the OcrEngineManager.CreateEngine method controls when the Thunk Server is used, based on the following table:

Note: In this table "platform" means the application's native platform (whether it is x86 or x64, not the operating system).

OCR Engine Type	Platform	useThunkServer value	Notes
Advantage/Professional	x86	False/True	Ignored. These OCR engine runtimes are thread-safe. The Thunk Server is never used with these engines.
Advantage/Professional	x64	False/True	Ignored. These OCR engine runtimes are thread-safe and support x64 natively. The Thunk Server is never used with these engines.
Arabic	x86	False	The Thunk Server is not used and the engine runtime is loaded in the same process as the calling application. Use this if only one instance of IOcrEngine will be created at the same time and in the same process.
Arabic	x86	True	The Thunk Server is used and the engine runtime is loaded in a separate process from the calling application. Use this if multiple instances of IOcrEngine will be created at the same time and in the same process.
Arabic	x64	False	Ignored. The Thunk Server is always used because the OCR runtime does not support x64 natively, so the engine runtime is loaded in a separate process. Thus, multiple IOcrEngine instances can be created at the same time by the calling process.
Arabic	x64	True	Ignored. The Thunk Server is always used because the OCR runtime does not support x64 natively, so the engine runtime is loaded in a separate process. Thus, multiple IOcrEngine instances can be created at the same time by the calling process.
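
The rules in the table above can be sketched as a small decision helper. This is only an illustration of the table, not part of the LEADTOOLS API: the EngineKind enum and the ThunkServerRules class are hypothetical names introduced here, and the real code passes an OcrEngineType value to OcrEngineManager.CreateEngine.

```csharp
using System;

// Hypothetical enum used only for this illustration.
// The LEADTOOLS API itself uses OcrEngineType.
public enum EngineKind { Advantage, Professional, Arabic }

public static class ThunkServerRules
{
    // Returns true when, according to the table above, the engine runtime
    // ends up hosted by the Thunk Server in a separate process.
    public static bool IsThunkServerUsed(EngineKind kind, bool is64BitProcess, bool useThunkServer)
    {
        // Advantage and Professional: the parameter is ignored and
        // the Thunk Server is never used.
        if (kind == EngineKind.Advantage || kind == EngineKind.Professional)
            return false;

        // Arabic in an x64 application: the parameter is ignored and
        // the Thunk Server is always used.
        if (is64BitProcess)
            return true;

        // Arabic in an x86 application: the parameter decides.
        return useThunkServer;
    }
}
```

For example, ThunkServerRules.IsThunkServerUsed(EngineKind.Arabic, Environment.Is64BitProcess, false) reports whether an Arabic engine created with useThunkServer set to false would still run out of process in the current application.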

The LEADTOOLS OCR interfaces are designed to handle all data marshaling internally. Your application program will not change if the Thunk Server is used: you switch between using the Thunk Server and not using it by simply changing the value of useThunkServer.

All this marshaling behind the scenes can cause a performance penalty. Thus, it is best to design your application so that it avoids the Thunk Server whenever possible. This can be accomplished using one of the following methods:

Use either the Advantage or Professional OCR engine.
Develop for x86 only (x86 applications can still run on x64 platforms).
Use one of the multi-threading techniques, described below, that do not require the Thunk Server whenever multi-threading is needed in a server-based application.

Multi-threaded OCR applications can be created in two different ways:

Multi-threaded OCR Applications Using Multiple Engines

In this scenario, create and use a dedicated IOcrEngine instance in each thread. As the table above shows, useThunkServer must be true when creating these instances with the Arabic engine in an x86 application (in an x64 application the Thunk Server is always used for that engine). For the Advantage and Professional OCR engines, the useThunkServer value is ignored.

All LEADTOOLS OCR engines in all platforms support this scenario.

The OCR Multi-threaded Demo source code shipping with LEADTOOLS shows an example of this scenario in the following cases:

The x64 version of the demo
In the x86 version, the Professional and Advantage OCR engines use the multi-IOcrDocument or multi-IOcrAutoRecognizeManager.RunJob technique described in the next section, whereas the Arabic OCR engine uses multiple IOcrEngine instances created using the Thunk Server.

The x86 and x64 C# and VB demo source code can be found at the following locations:


             [LEADTOOLS Installation Folder]\Examples\DotNet\CS\OcrMultiThreadingDemo


             [LEADTOOLS Installation Folder]\Examples\DotNet\VB\OcrMultiThreadingDemo

Multi-threaded OCR Applications Using Multiple Documents

IOcrDocument is a self-contained object that is used to load source images such as TIF and raster PDF, zone and recognize them, and export them to PDF, DOC, DOCX (2007/2010), HTML or TXT format. Thus, another way to achieve multi-threading in an OCR application is to create one instance of IOcrEngine in the main thread and then queue work items in dedicated threads, each using its own IOcrDocument instance. IOcrAutoRecognizeManager is a helper interface that creates IOcrDocument instances internally and can be used in the same way.

Since only one instance of IOcrEngine will be created at any time, the Thunk Server is not needed and no performance penalty occurs.

Not all OCR engine types support this scenario. They are listed in the table below:

OCR Engine Type	Platform	Multi-Document supported
Advantage	x86/x64	Yes, with unlimited number of documents at the same time.
Professional	x86	Yes, with up to 64 documents at the same time.
Professional	x64	Yes, with up to 64 documents at the same time.
Arabic	x86/x64	No

The OCR Multi-threaded Demo source code shipping with LEADTOOLS shows an example of this scenario in the following cases:

In the x86 version with the Professional or Advantage OCR engines

Sample Applications

The following sections describe sample applications and recommend ways to achieve thread-safety and process integrity when using the LEADTOOLS OCR engines.

OCR HTTP Web Service Version 1

An HTTP Web Service application generally runs in a session-less mode. Resources cannot be shared between multiple connections.

Create a Web Service application and add the following method:


             
             [WebMethod]
             public void Recognize(string imageFileName, DocumentFormat format, string documentFileName)
             {
                // Unlock support
                string MY_LICENSE_FILE = "d:\\temp\\TestLic.lic";
                string MY_DEVELOPER_KEY = "xyz123abc";
                RasterSupport.SetLicense(MY_LICENSE_FILE, MY_DEVELOPER_KEY);
                // Create the engine using the Thunk Server
                using(IOcrEngine ocrEngine = OcrEngineManager.CreateEngine(OcrEngineType.[EngineTypeHere], true))
                {
                   // Start it
                   ocrEngine.Startup(null, null, null, @"C:\LEADTOOLS 19\Bin\Common\OcrAdvantageRuntime");
                   // Recognize
                   IOcrAutoRecognizeJob ocrJob = ocrEngine.AutoRecognizeManager.CreateJob(
                      new OcrAutoRecognizeJobData(imageFileName, format, documentFileName));
                   ocrEngine.AutoRecognizeManager.RunJob(ocrJob);
                }
             }

In this version, we created a web method in an HTTP Web Service to recognize an input image file and output a document file in a specific format. The IOcrEngine has useThunkServer set to true to ensure thread-safety. The code above will work for all engines and on all platforms.

Engines and Platforms Supported: All.

Pros: Easy to implement.

Cons: Performance hit from marshaling using the Thunk Server. Process memory and resources are shared between all connections.

OCR HTTP Web Service Version 2

Create an x86 console application (MyOcrRecognize.exe) that performs OCR:


             static void Main(string[] args)
             {
                // Get the parameters
                string imageFileName = args[0];
                DocumentFormat format = (DocumentFormat)Enum.Parse(typeof(DocumentFormat), args[1]);
                string documentFileName = args[2];
            
                // Unlock support
                string MY_LICENSE_FILE = "d:\\temp\\TestLic.lic";
                string MY_DEVELOPER_KEY = "xyz123abc";
                RasterSupport.SetLicense(MY_LICENSE_FILE, MY_DEVELOPER_KEY);
                // Create the engine without Thunk Server
                using(IOcrEngine ocrEngine = OcrEngineManager.CreateEngine(OcrEngineType.[EngineTypeHere], false))
                {
                   // Start it
                   ocrEngine.Startup(null, null, null, @"C:\LEADTOOLS 19\Bin\Common\OcrAdvantageRuntime");
                   // Recognize
                   IOcrAutoRecognizeJob ocrJob = ocrEngine.AutoRecognizeManager.CreateJob(
                      new OcrAutoRecognizeJobData(imageFileName, format, documentFileName));
                   ocrEngine.AutoRecognizeManager.RunJob(ocrJob);
                }
             }

Create a Web Service application and add the following method:


             [WebMethod]
             public void Recognize(string imageFileName, DocumentFormat format, string documentFileName)
             {
                // Call our OCR console app
                // Quote the file names and separate the three arguments with spaces
                string arguments = "\"" + imageFileName + "\" " + format.ToString() + " \"" + documentFileName + "\"";
                Process.Start("MyOcrRecognize.exe", arguments);
             }

In this version, we create two applications:

An x86 console application that creates an IOcrEngine and performs OCR without using the Thunk Server since each instance of this application will run in its own process. Options are passed through the standard command line.

And a web method in an HTTP Web Service that creates a new instance of the OCR application for each request.

Engines and Platforms Supported: All.

Pros: No performance hit from marshaling using the Thunk Server. Complete process separation and safety. Each connection uses its own dedicated process to OCR.

Cons: Performance hit from creating and destroying processes. More complex to implement than the first version.
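
When launching the console application, it helps to quote the file names, separate the arguments with spaces, and wait for the process to finish before reporting success. The following is a minimal sketch using only System.Diagnostics. The OcrProcessLauncher class and its method names are hypothetical, the format is passed as a plain string (for example, the result of format.ToString()), and it assumes the console application signals failure through a non-zero exit code.

```csharp
using System;
using System.Diagnostics;

public static class OcrProcessLauncher
{
    // Builds a command line of the form: "image file" Format "document file"
    // The quotes keep file names that contain spaces in a single argument.
    public static string BuildArguments(string imageFileName, string format, string documentFileName)
    {
        return "\"" + imageFileName + "\" " + format + " \"" + documentFileName + "\"";
    }

    // Starts the console application, waits for it to exit,
    // and returns its exit code to the caller.
    public static int Run(string exePath, string arguments)
    {
        ProcessStartInfo startInfo = new ProcessStartInfo(exePath, arguments);
        startInfo.UseShellExecute = false;   // start the executable directly
        startInfo.CreateNoWindow = true;     // no console window on a server

        using (Process process = Process.Start(startInfo))
        {
            process.WaitForExit();
            return process.ExitCode;
        }
    }
}
```

A web method could then call OcrProcessLauncher.Run("MyOcrRecognize.exe", OcrProcessLauncher.BuildArguments(imageFileName, format.ToString(), documentFileName)) and treat a non-zero return value as a failed recognition. Waiting ties up the calling thread for the duration of the OCR operation, which is the price of knowing when the output document is ready.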

OCR Windows Service or Server Version 1

A Windows Service or Server generally runs in a session-enabled mode. Resources can be shared between multiple connections. The server will usually create a thread to process each connection.

Typically, a Windows Service or Server has the following methods: StartServer, ProcessRequest and StopServer.

The first version of the service/server creates a dedicated IOcrEngine, using the Thunk Server, for each request:


             void StartServer()
             {
                // Unlock support once here
                string MY_LICENSE_FILE = "d:\\temp\\TestLic.lic";
                string MY_DEVELOPER_KEY = "xyz123abc";
                RasterSupport.SetLicense(MY_LICENSE_FILE, MY_DEVELOPER_KEY);
             }
            
             void StopServer()
             {
             }
            
             void ProcessRequest(string imageFileName, DocumentFormat format, string documentFileName)
             {
                // Queue the work
                ThreadPool.QueueUserWorkItem(delegate(object o)
                {
                   // Create the engine using the Thunk Server
                   using(IOcrEngine ocrEngine = OcrEngineManager.CreateEngine(OcrEngineType.[EngineTypeHere], true))
                   {
                      // Start it
                      ocrEngine.Startup(null, null, null, @"C:\LEADTOOLS 19\Bin\Common\OcrAdvantageRuntime");
                      // Recognize
                      IOcrAutoRecognizeJob ocrJob = ocrEngine.AutoRecognizeManager.CreateJob(
                         new OcrAutoRecognizeJobData(imageFileName, format, documentFileName));
                      ocrEngine.AutoRecognizeManager.RunJob(ocrJob);
                   }
                });
             }

Engines and Platforms Supported: All.

Pros: Easy to implement.

Cons: Performance hit from marshaling using the Thunk Server. Process memory and resources are shared between all connections.

OCR Windows Service or Server Version 2

In version 2, we will re-use MyOcrRecognize.exe from earlier to perform the OCR operation in a separate process:


             void StartServer()
             {
             }
            
             void StopServer()
             {
             }
            
             void ProcessRequest(string imageFileName, DocumentFormat format, string documentFileName)
             {
                // Queue the work
                ThreadPool.QueueUserWorkItem(delegate(object o)
                {
                   // Call our OCR console app
                   // Quote the file names and separate the three arguments with spaces
                   string arguments = "\"" + imageFileName + "\" " + format.ToString() + " \"" + documentFileName + "\"";
                   Process.Start("MyOcrRecognize.exe", arguments);
                });
             }

Engines and Platforms Supported: All.

Pros: No performance hit from marshaling using the Thunk Server. Complete process separation and safety. Each connection uses its own dedicated process to OCR.

Cons: Performance hit from creating and destroying processes. More complex to implement than the first version.

OCR Windows Service or Server Version 3

In version 3, we will use the multi-document capabilities of the engine (if supported) to perform true multi-threading:


             // Shared instance of IOcrEngine
             private IOcrEngine ocrEngine;
            
             void StartServer()
             {
                // Unlock support
                string MY_LICENSE_FILE = "d:\\temp\\TestLic.lic";
                string MY_DEVELOPER_KEY = "xyz123abc";
                RasterSupport.SetLicense(MY_LICENSE_FILE, MY_DEVELOPER_KEY);
                // Start the OCR engine without using the Thunk Server
                ocrEngine = OcrEngineManager.CreateEngine(OcrEngineType.[EngineTypeHere], false);
                // Start it
                ocrEngine.Startup(null, null, null, @"C:\LEADTOOLS 19\Bin\Common\OcrAdvantageRuntime");
             }
            
             void StopServer()
             {
                // Stop the OCR engine
                ocrEngine.Dispose();
             }
            
             void ProcessRequest(string imageFileName, DocumentFormat format, string documentFileName)
             {
                // Queue the work
                ThreadPool.QueueUserWorkItem(delegate(object o)
                {
                   // Recognize
                   IOcrAutoRecognizeJob ocrJob = ocrEngine.AutoRecognizeManager.CreateJob(
                      new OcrAutoRecognizeJobData(imageFileName, format, documentFileName));
                   ocrEngine.AutoRecognizeManager.RunJob(ocrJob);
                });
             }

Engines and Platforms Supported: Advantage and Professional on x86 and x64 (see the multi-document support table above).

Pros: No performance hit from marshaling using the Thunk Server. True multi-threading.

Cons: Not supported by all OCR engines and platforms. Restriction on number of recognition operations in some engines (64 in Professional).
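
One way to stay within a limit on simultaneous recognition operations (64 for the Professional engine) is to guard each operation with a counting semaphore. The following is a minimal sketch using only System.Threading. RecognitionThrottle is a hypothetical helper class, not part of LEADTOOLS, and the limit is supplied by the caller.

```csharp
using System;
using System.Threading;

// Hypothetical helper that lets at most maxConcurrent work items run at once.
public sealed class RecognitionThrottle : IDisposable
{
    private readonly SemaphoreSlim slots;

    public RecognitionThrottle(int maxConcurrent)
    {
        // All slots are free initially; the count can never exceed maxConcurrent.
        slots = new SemaphoreSlim(maxConcurrent, maxConcurrent);
    }

    // Blocks until a slot is free, runs the work, and always frees the slot,
    // even if the work throws.
    public void Run(Action work)
    {
        slots.Wait();
        try
        {
            work();
        }
        finally
        {
            slots.Release();
        }
    }

    public void Dispose()
    {
        slots.Dispose();
    }
}
```

With a shared instance created as new RecognitionThrottle(64) in StartServer, the delegate queued in ProcessRequest could wrap its RunJob call in throttle.Run(...). Requests beyond the limit then wait for a free slot instead of exceeding it. Because Run blocks a thread-pool thread while waiting, a busy server may prefer a dedicated work queue.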
