Manually Recognize and Process a Form - C# .NET 6

This tutorial shows how to create a master set of forms, then recognize and process a filled form, using the LEADTOOLS Low-Level Forms Interface in a C# .NET 6 application.

Overview
Summary: This tutorial covers how to recognize and process a form using low-level Forms Recognition and Processing in a C# .NET 6 Console application.
Completion Time: 20 minutes
Visual Studio Project: Download tutorial project (2 KB)
Platform: C# .NET 6 Console Application
IDE: Visual Studio 2022
Runtime Target: .NET 6 or higher
Development License: Download LEADTOOLS

Required Knowledge

Before working on this tutorial, get familiar with the basic steps of creating a project by reviewing the Add References and Set a License tutorial.

Create the Project and Add LEADTOOLS References

Start with a copy of the project created in the Add References and Set a License tutorial. If the project is not available, follow the steps in that tutorial to create it.

The references needed depend upon the purpose of the project. References can be added via NuGet packages.

This tutorial requires the following NuGet package:

For a complete list of which DLL files are required for your application, refer to Files to be Included With Your Application.

Set the License File

The license unlocks the features needed for the project. It must be set before any toolkit function is called. For details, including tutorials for different platforms, refer to Setting a Runtime License.

There are two types of runtime licenses:

Initialize the FormRecognitionEngine, RasterCodecs, IOcrEngine, and FormProcessingEngine

With the project created, the references added, and the license set, coding can begin.

In the Solution Explorer, open Program.cs. Add the following statements to the using block at the top of Program.cs.

C#
using System; 
using System.IO; 
using System.Collections.Generic; 
using Leadtools; 
using Leadtools.Codecs; 
using Leadtools.Forms.Common; 
using Leadtools.Ocr; 
using Leadtools.Forms.Recognition; 
using Leadtools.Forms.Recognition.Ocr; 
using Leadtools.Forms.Processing; 

Add the following static fields to the Program class.

C#
private static FormRecognitionEngine recognitionEngine; 
private static RasterCodecs codecs; 
private static IOcrEngine formsOCREngine; 
private static FormProcessingEngine processingEngine; 

Add a new method to the Program class named InitFormsEngines(). Call the InitFormsEngines() method inside the Main() method below the set license call, as shown below.

C#
static void Main(string[] args) 
{ 
    if (!InitLEAD()) 
        Console.WriteLine("Error setting license"); 
    else 
        Console.WriteLine("License file set successfully"); 
    
    InitFormsEngines(); 
    CreateMasterFormAttributes(); 
    RecognizeForm(); 
} 

Add the code below to the InitFormsEngines() method to initialize the FormRecognitionEngine, FormProcessingEngine, RasterCodecs, and IOcrEngine objects.

C#
static void InitFormsEngines() 
{ 
    Console.WriteLine("Initializing Engines"); 
    codecs = new RasterCodecs(); 
    recognitionEngine = new FormRecognitionEngine(); 
    processingEngine = new FormProcessingEngine(); 
    formsOCREngine = OcrEngineManager.CreateEngine(OcrEngineType.LEAD); 
    formsOCREngine.Startup(codecs, null, null, @"C:\LEADTOOLS22\Bin\Common\OcrLEADRuntime"); 
    OcrObjectsManager ocrObjectsManager = new OcrObjectsManager(formsOCREngine); 
    ocrObjectsManager.Engine = formsOCREngine; 
    recognitionEngine.ObjectsManagers.Add(ocrObjectsManager); 
 
    Console.WriteLine("Engines initialized successfully"); 
} 
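
The Startup() call above relies on a hard-coded OCR runtime folder, and a wrong path tends to surface as a less obvious engine error. A minimal sketch of a guard that could run before Startup() is shown below. PathGuard and RequireDirectory are hypothetical names, not part of LEADTOOLS, and only standard System.IO calls are used.

```csharp
using System;
using System.IO;

// Hypothetical helper (not part of LEADTOOLS): fails fast with a clear message
// when a folder the tutorial depends on is missing.
public static class PathGuard
{
    // Returns the path unchanged when the folder exists; otherwise throws.
    public static string RequireDirectory(string path, string description)
    {
        if (!Directory.Exists(path))
            throw new DirectoryNotFoundException($"{description} not found: {path}");
        return path;
    }
}
```

For example, the runtime folder passed to Startup() could be wrapped as PathGuard.RequireDirectory(@"C:\LEADTOOLS22\Bin\Common\OcrLEADRuntime", "OCR runtime folder"). The same check would also suit the master form and filled form folders used later in the tutorial.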

Add the Create Master Form Attributes Code

In the Program class add two new methods named CreateMasterFormAttributes() and RecognizeForm(). Both of these new methods will be called inside the Main() method, below the InitFormsEngines() method, as shown above. Ensure that the CreateMasterFormAttributes() method is called above the RecognizeForm() method, as the attributes for the master forms need to be created before any filled forms can be recognized.

Add the code below to the CreateMasterFormAttributes() method to create the master form attributes and add the master forms to the FormRecognitionEngine object.

C#
private static void CreateMasterFormAttributes() 
{ 
 
    Console.WriteLine("Processing Master Form"); 
    string[] masterFileNames = Directory.GetFiles(@"C:\LEADTOOLS22\Resources\Images\Forms\MasterForm Sets\OCR", "*.tif", SearchOption.AllDirectories); 
 
    foreach (string masterFileName in masterFileNames) 
    { 
        string formName = Path.GetFileNameWithoutExtension(masterFileName); 
        using (RasterImage image = codecs.Load(masterFileName, 0, CodecsLoadByteOrder.BgrOrGray, 1, -1)) 
        { 
            FormRecognitionAttributes masterFormAttributes = recognitionEngine.CreateMasterForm(formName, Guid.Empty, null); 
            for (int i = 0; i < image.PageCount; i++) 
            { 
                image.Page = i + 1; 
                recognitionEngine.AddMasterFormPage(masterFormAttributes, image, null); 
            } 
            recognitionEngine.CloseMasterForm(masterFormAttributes); 
            File.WriteAllBytes(formName + ".bin", masterFormAttributes.GetData()); 
        } 
    } 
    Console.WriteLine("Master Form Processing Complete"); 
    Console.WriteLine("============================================================="); 
} 
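
CreateMasterFormAttributes() writes each .bin file to a relative path, which resolves against the current working directory, while RecognizeForm() searches the folder of the executing assembly. When the project is started from Visual Studio these are normally the same folder, but they can differ when the executable is launched from elsewhere. A minimal sketch of building the path explicitly follows. AttributeFiles and GetAttributesPath are hypothetical names, and AppContext.BaseDirectory is assumed to be an acceptable stand-in for the assembly folder.

```csharp
using System;
using System.IO;

// Hypothetical helper (not part of LEADTOOLS): keeps the write and read
// locations of the master form attribute files in one place.
public static class AttributeFiles
{
    // Returns "<application base directory>/<formName>.bin".
    public static string GetAttributesPath(string formName)
    {
        return Path.Combine(AppContext.BaseDirectory, formName + ".bin");
    }
}
```

With this helper, the write in the loop above could become File.WriteAllBytes(AttributeFiles.GetAttributesPath(formName), masterFormAttributes.GetData()), so the files land where the recognition step looks for them.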

Add the code below to the RecognizeForm() method to recognize the filled form by comparing it against each master form in the set.

C#
private static void RecognizeForm() 
{ 
    Console.WriteLine("Recognizing Form\n"); 
    var GetProjectDirectory = Path.GetDirectoryName(System.Reflection.Assembly.GetExecutingAssembly().Location); 
    string formToRecognize = @"C:\LEADTOOLS22\Resources\Images\Forms\Forms to be Recognized\OCR\W9_OCR_Filled.tif"; 
    // To load the file from a memory stream instead, uncomment the two lines below 
    // byte[] buffer = File.ReadAllBytes(formToRecognize); 
    // MemoryStream ms = new MemoryStream(buffer); 
    // and change the Load call that follows to codecs.Load(ms) 
    using (RasterImage image = codecs.Load(formToRecognize, 0, CodecsLoadByteOrder.BgrOrGray, 1, -1)) 
    { 
        // Use this console command to double check the file has been loaded 
        // Console.WriteLine("document loaded by stream"); 
        FormRecognitionAttributes filledFormAttributes = recognitionEngine.CreateForm(null); 
 
        for (int i = 0; i < image.PageCount; i++) 
        { 
            image.Page = i + 1; 
            recognitionEngine.AddFormPage(filledFormAttributes, image, null); 
        } 
        recognitionEngine.CloseForm(filledFormAttributes); 
 
        string resultMessage = "The form could not be recognized"; 
        string[] masterFileNames = Directory.GetFiles(GetProjectDirectory, "*.bin"); 
 
        foreach (string masterFileName in masterFileNames) 
        { 
            string fieldsfName = Path.GetFileNameWithoutExtension(masterFileName) + ".xml"; 
            string fieldsfullPath = Path.Combine(@"C:\LEADTOOLS22\Resources\Images\Forms\MasterForm Sets\OCR", fieldsfName); 
            processingEngine.LoadFields(fieldsfullPath); 
            FormRecognitionAttributes masterFormAttributes = new FormRecognitionAttributes(); 
            masterFormAttributes.SetData(File.ReadAllBytes(masterFileName)); 
            FormRecognitionResult recognitionResult = recognitionEngine.CompareForm(masterFormAttributes, filledFormAttributes, null); 
            if (recognitionResult.Confidence >= 80) 
            { 
                List<PageAlignment> alignment = new List<PageAlignment>(); 
                for (int k = 0; k < recognitionResult.PageResults.Count; k++) 
                    alignment.Add(recognitionResult.PageResults[k].Alignment); 
 
                resultMessage = $"This form has been recognized as a {Path.GetFileNameWithoutExtension(masterFileName)}"; 
                ProcessForm(image, alignment); 
                break; 
            } 
        } 
 
        // Write the message directly; a second argument would turn it into a format string 
        Console.WriteLine(resultMessage); 
        Console.WriteLine("=============================================================\n"); 
    } 
} 
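
The loop above stops at the first master form whose confidence reaches 80. When several master forms look alike, it may be preferable to compare against every master form and keep the highest score. The sketch below shows that selection on plain (name, confidence) pairs so it runs without the SDK. BestMatch and Pick are hypothetical names, and in the tutorial the confidence values would come from recognitionResult.Confidence.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical helper (not part of LEADTOOLS): picks the best-scoring candidate.
public static class BestMatch
{
    // Returns the name with the highest confidence at or above minConfidence,
    // or null when no candidate qualifies. Ties keep the earliest candidate.
    public static string? Pick(IEnumerable<(string Name, int Confidence)> candidates, int minConfidence)
    {
        string? bestName = null;
        int bestConfidence = minConfidence - 1; // anything below the threshold is ignored

        foreach (var candidate in candidates)
        {
            if (candidate.Confidence > bestConfidence)
            {
                bestName = candidate.Name;
                bestConfidence = candidate.Confidence;
            }
        }
        return bestName;
    }
}
```

The trade-off is that every master form is compared before processing starts, which costs more time on large master sets than breaking out at the first acceptable match.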

Note

Shipped and installed with the LEADTOOLS SDK are sample master form sets and sample filled forms for recognition and processing. This tutorial uses these samples. The sample files are installed at <INSTALL_DIR>\LEADTOOLS22\Resources\Images\Forms.

Handling streams

Alternatively, you can load the file from a memory stream. To do this, add the following code to the RecognizeForm() method, just above the first using statement:

C#
byte[] buffer = File.ReadAllBytes(formToRecognize); 
MemoryStream ms = new MemoryStream(buffer); 
Then pass the memory stream to the codecs.Load() method:

C#
using(RasterImage image = codecs.Load(ms)){...} 

Add the Process Form Code

In the Program class, add a new method, ProcessForm(RasterImage image, List<PageAlignment> alignment). This method is called inside the RecognizeForm() method shown in the previous step.

C#
private static void ProcessForm(RasterImage image, List<PageAlignment> alignment) 
{ 
    processingEngine.OcrEngine = formsOCREngine; 
    string resultsMessage = string.Empty; 
 
    processingEngine.Process(image, alignment); 
    foreach (FormPage formPage in processingEngine.Pages) 
        foreach (FormField field in formPage) 
            // Only text results expose Text; skip null fields and other result types 
            if (field != null && field.Result is TextFormFieldResult textResult) 
                resultsMessage = $"{resultsMessage}{field.Name} = {textResult.Text}\n"; 
 
    // Write the text directly; OCR output used as a format string can throw if it contains braces 
    if (string.IsNullOrEmpty(resultsMessage)) 
        Console.WriteLine("No fields were processed"); 
    else 
        Console.WriteLine(resultsMessage); 
} 

Shutdown the OCR Engine

In the Main() method, add a call to formsOCREngine.Shutdown() after the RecognizeForm() call to properly shut down the OCR engine. The Main(string[] args) method should now look like this:

C#
static void Main(string[] args) 
{ 
    if (!InitLEAD()) 
        Console.WriteLine("Error setting license"); 
    else 
        Console.WriteLine("License file set successfully"); 
    
    InitFormsEngines(); 
    CreateMasterFormAttributes(); 
    RecognizeForm(); 
 
    if (formsOCREngine != null && formsOCREngine.IsStarted) 
        formsOCREngine.Shutdown(); 
} 
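
In the listing above, Shutdown() is skipped if any earlier call throws. A common pattern is to put the work in a try block and the shutdown in finally. The sketch below demonstrates this with DemoEngine, a hypothetical stand-in that only mimics the IsStarted and Shutdown() members used in the tutorial; it is not a LEADTOOLS type.

```csharp
using System;

// Hypothetical stand-in for an engine with start/stop semantics; not a LEADTOOLS type.
public sealed class DemoEngine
{
    public bool IsStarted { get; private set; }
    public void Startup() => IsStarted = true;
    public void Shutdown() => IsStarted = false;
}

public static class ShutdownSketch
{
    // Starts the engine, runs the work, and shuts the engine down even if the work throws.
    public static void RunWithCleanup(DemoEngine engine, Action work)
    {
        engine.Startup();
        try
        {
            work();
        }
        finally
        {
            if (engine.IsStarted)
                engine.Shutdown();
        }
    }
}
```

In the tutorial's Main() method, the same shape would place the InitFormsEngines(), CreateMasterFormAttributes(), and RecognizeForm() calls inside try, with the existing IsStarted check and Shutdown() call inside finally.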

Run the Project

Run the project by pressing F5, or by selecting Debug -> Start Debugging.

If the steps were followed correctly, the console appears and the application displays the recognized form along with the processed fields. For this example, a W9 form is used with 6 filled fields: Business Name, Address, City, State, Zip, and Name.

Recognition and Processing results displayed to the console.

Wrap-up

This tutorial showed how to create a set of attributes from master forms with the FormRecognitionAttributes class, recognize a form using the FormRecognitionEngine class, and process the form's fields and display the results to the console using the FormProcessingEngine class.

Help Version 22.0.2024.3.20
Products | Support | Contact Us | Intellectual Property Notices
© 1991-2023 LEAD Technologies, Inc. All Rights Reserved.
