This tutorial shows how to create a set of master forms, then recognize and process a filled form using the LEADTOOLS low-level Forms interface in a C# .NET 6 application.
Overview | |
---|---|
Summary | This tutorial covers how to recognize and process a form using low-level Forms Recognition and Processing in a C# .NET 6 Console application. |
Completion Time | 20 minutes |
Visual Studio Project | Download tutorial project (2 KB) |
Platform | C# .NET 6 Console Application |
IDE | Visual Studio 2022 |
Runtime Target | .NET 6 or higher |
Development License | Download LEADTOOLS |
Get familiar with the basic steps of creating a project by reviewing the Add References and Set a License tutorial, before working on the Manually Recognize and Process a Form - C# .NET 6 tutorial.
Start with a copy of the project created in the Add References and Set a License tutorial. If the project is not available, follow the steps in that tutorial to create it.
The references needed depend upon the purpose of the project. References can be added via NuGet packages.
This tutorial requires the following NuGet package:
Leadtools.Document.Sdk
For a complete list of which DLL files are required for your application, refer to Files to be Included With Your Application.
The License unlocks the features needed for the project. It must be set before any toolkit function is called. For details, including tutorials for different platforms, refer to Setting a Runtime License.
There are two types of runtime licenses: evaluation licenses and deployment licenses.
With the project created, the references added, and the license set, coding can begin.
In the Solution Explorer, open Program.cs. Add the following statements to the using block at the top of Program.cs.
using System;
using System.IO;
using System.Collections.Generic;
using Leadtools;
using Leadtools.Codecs;
using Leadtools.Forms.Common;
using Leadtools.Ocr;
using Leadtools.Forms.Recognition;
using Leadtools.Forms.Recognition.Ocr;
using Leadtools.Forms.Processing;
Add the following global variables to the Program class.
private static FormRecognitionEngine recognitionEngine;
private static RasterCodecs codecs;
private static IOcrEngine formsOCREngine;
private static FormProcessingEngine processingEngine;
Add a new method to the Program class named InitFormsEngines(). Call the InitFormsEngines() method inside the Main() method, below the set license call, as shown below.
static void Main(string[] args)
{
    if (!InitLEAD())
        Console.WriteLine("Error setting license");
    else
        Console.WriteLine("License file set successfully");
    InitFormsEngines();
    CreateMasterFormAttributes();
    RecognizeForm();
}
Add the code below to the InitFormsEngines() method to initialize the FormRecognitionEngine, FormProcessingEngine, RasterCodecs, and IOcrEngine objects.
static void InitFormsEngines()
{
    Console.WriteLine("Initializing Engines");
    codecs = new RasterCodecs();
    recognitionEngine = new FormRecognitionEngine();
    processingEngine = new FormProcessingEngine();
    // Start the LEAD OCR engine; adjust the runtime path to match your installation.
    formsOCREngine = OcrEngineManager.CreateEngine(OcrEngineType.LEAD);
    formsOCREngine.Startup(codecs, null, null, @"C:\LEADTOOLS23\Bin\Common\OcrLEADRuntime");
    // Register an OCR objects manager with the recognition engine.
    OcrObjectsManager ocrObjectsManager = new OcrObjectsManager(formsOCREngine);
    ocrObjectsManager.Engine = formsOCREngine;
    recognitionEngine.ObjectsManagers.Add(ocrObjectsManager);
    Console.WriteLine("Engines initialized successfully");
}
In the Program class, add two new methods named CreateMasterFormAttributes() and RecognizeForm(). Both methods are called inside the Main() method, below the InitFormsEngines() call, as shown above. Ensure that CreateMasterFormAttributes() is called before RecognizeForm(), because the attributes for the master forms must exist before any filled form can be recognized.
Add the code below to the CreateMasterFormAttributes() method to create the master form attributes and add the master forms to the FormRecognitionEngine object.
private static void CreateMasterFormAttributes()
{
    Console.WriteLine("Processing Master Form");
    // Save the attribute files next to the executable, where RecognizeForm() looks for them.
    string outputDirectory = Path.GetDirectoryName(System.Reflection.Assembly.GetExecutingAssembly().Location);
    string[] masterFileNames = Directory.GetFiles(@"C:\LEADTOOLS23\Resources\Images\Forms\MasterForm Sets\OCR", "*.tif", SearchOption.AllDirectories);
    foreach (string masterFileName in masterFileNames)
    {
        string formName = Path.GetFileNameWithoutExtension(masterFileName);
        // Load every page of the master form image.
        using (RasterImage image = codecs.Load(masterFileName, 0, CodecsLoadByteOrder.BgrOrGray, 1, -1))
        {
            FormRecognitionAttributes masterFormAttributes = recognitionEngine.CreateMasterForm(formName, Guid.Empty, null);
            for (int i = 0; i < image.PageCount; i++)
            {
                image.Page = i + 1; // RasterImage page numbers are 1-based
                recognitionEngine.AddMasterFormPage(masterFormAttributes, image, null);
            }
            recognitionEngine.CloseMasterForm(masterFormAttributes);
            // Persist the attributes so they can be reloaded during recognition.
            File.WriteAllBytes(Path.Combine(outputDirectory, formName + ".bin"), masterFormAttributes.GetData());
        }
    }
    Console.WriteLine("Master Form Processing Complete");
    Console.WriteLine("=============================================================");
}
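The saved .bin files are later searched for in the folder that contains the executable, so the write location and the search location have to agree. The following is a minimal sketch of that round trip using only standard .NET calls. The byte array is a hypothetical stand-in for the data returned by GetData(), the file name Demo_Master.bin is made up for illustration, and AppContext.BaseDirectory is assumed to resolve to the same folder as the executing assembly, which is normally the case for a regular console build. It is written with top-level statements so it can be pasted into a scratch .NET 6 console project.

```csharp
using System;
using System.IO;
using System.Linq;

// Hypothetical stand-in for masterFormAttributes.GetData().
byte[] attributeData = { 1, 2, 3, 4 };

// Write next to the executable so a later search of that folder finds the file,
// regardless of the current working directory.
string outputDirectory = AppContext.BaseDirectory;
string binPath = Path.Combine(outputDirectory, "Demo_Master.bin");
File.WriteAllBytes(binPath, attributeData);

// Enumerate the saved attribute files and read the data back.
string[] savedFiles = Directory.GetFiles(outputDirectory, "*.bin");
byte[] roundTripped = File.ReadAllBytes(binPath);

Console.WriteLine($"Found {savedFiles.Length} .bin file(s) in {outputDirectory}");
Console.WriteLine($"Round trip intact: {roundTripped.SequenceEqual(attributeData)}");
```

Writing to a relative path instead would place the file in the current working directory, which can differ from the executable's folder when the application is started from a command line.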
Add the code below to the RecognizeForm() method to recognize the filled form by comparing it against each of the saved master forms.
private static void RecognizeForm()
{
    Console.WriteLine("Recognizing Form\n");
    // Folder that contains the executable and the saved .bin master form attributes.
    string outputDirectory = Path.GetDirectoryName(System.Reflection.Assembly.GetExecutingAssembly().Location);
    string formToRecognize = @"C:\LEADTOOLS23\Resources\Images\Forms\Forms to be Recognized\OCR\W9_OCR_Filled.tif";
    // To load the file from a memory stream instead, use the commented code below
    // byte[] buffer = File.ReadAllBytes(formToRecognize);
    // MemoryStream ms = new MemoryStream(buffer);
    // and change the call below to codecs.Load(ms)
    using (RasterImage image = codecs.Load(formToRecognize, 0, CodecsLoadByteOrder.BgrOrGray, 1, -1))
    {
        // Uncomment the line below to confirm that the file was loaded
        // Console.WriteLine("document loaded by stream");
        FormRecognitionAttributes filledFormAttributes = recognitionEngine.CreateForm(null);
        for (int i = 0; i < image.PageCount; i++)
        {
            image.Page = i + 1; // RasterImage page numbers are 1-based
            recognitionEngine.AddFormPage(filledFormAttributes, image, null);
        }
        recognitionEngine.CloseForm(filledFormAttributes);
        string resultMessage = "The form could not be recognized";
        string[] masterFileNames = Directory.GetFiles(outputDirectory, "*.bin");
        foreach (string masterFileName in masterFileNames)
        {
            // Each master form has a matching .xml file that defines its fields.
            string fieldsFileName = Path.GetFileNameWithoutExtension(masterFileName) + ".xml";
            string fieldsFullPath = Path.Combine(@"C:\LEADTOOLS23\Resources\Images\Forms\MasterForm Sets\OCR", fieldsFileName);
            processingEngine.LoadFields(fieldsFullPath);
            FormRecognitionAttributes masterFormAttributes = new FormRecognitionAttributes();
            masterFormAttributes.SetData(File.ReadAllBytes(masterFileName));
            FormRecognitionResult recognitionResult = recognitionEngine.CompareForm(masterFormAttributes, filledFormAttributes, null);
            // Accept the first master form that matches with a confidence of at least 80.
            if (recognitionResult.Confidence >= 80)
            {
                List<PageAlignment> alignment = new List<PageAlignment>();
                for (int k = 0; k < recognitionResult.PageResults.Count; k++)
                    alignment.Add(recognitionResult.PageResults[k].Alignment);
                resultMessage = $"This form has been recognized as a {Path.GetFileNameWithoutExtension(masterFileName)}";
                ProcessForm(image, alignment);
                break;
            }
        }
        Console.WriteLine(resultMessage);
        Console.WriteLine("=============================================================\n");
    }
}
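The loop above stops at the first master form whose confidence reaches 80. When several master forms look alike, it may be preferable to compare against all of them and keep the highest score. The following is a minimal sketch of that selection logic only; it does not call LEADTOOLS. The form names and scores are hypothetical stand-ins for the values that CompareForm() would report, and PickBestMatch is an illustrative helper, not part of the toolkit.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical confidence values, one per master form.
var scores = new List<(string MasterName, int Confidence)>
{
    ("FormA", 35),
    ("FormB", 92),
    ("FormC", 81),
};

string? bestMatch = PickBestMatch(scores, 80);
Console.WriteLine(bestMatch is null
    ? "The form could not be recognized"
    : $"This form has been recognized as a {bestMatch}");

// Returns the name with the highest confidence at or above the threshold,
// or null when no candidate reaches it. On a tie, the first candidate wins.
static string? PickBestMatch(IEnumerable<(string MasterName, int Confidence)> candidates, int minimumConfidence)
{
    string? bestName = null;
    int bestConfidence = minimumConfidence - 1;
    foreach (var (name, confidence) in candidates)
    {
        if (confidence > bestConfidence)
        {
            bestName = name;
            bestConfidence = confidence;
        }
    }
    return bestName;
}
```

The trade-off is that every master form is compared before a decision is made, which costs more time than stopping at the first acceptable match.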
Note: Sample master form sets and sample filled forms for recognition and processing are installed with the LEADTOOLS SDK, and this tutorial uses them. The sample files are located at <INSTALL_DIR>\LEADTOOLS23\Resources\Images\Forms.
Alternatively, you can load the file from a memory stream. To do this, add the following code to the RecognizeForm() method, just above the first using statement:
byte[] buffer = File.ReadAllBytes(formToRecognize);
MemoryStream ms = new MemoryStream(buffer);
Then pass the memory stream to the codecs.Load() method:
using (RasterImage image = codecs.Load(ms)) {...}
In the Program class, add a new method, ProcessForm(RasterImage image, List<PageAlignment> alignment). This method is called inside the RecognizeForm() method shown in the previous step.
private static void ProcessForm(RasterImage image, List<PageAlignment> alignment)
{
    processingEngine.OcrEngine = formsOCREngine;
    string resultsMessage = string.Empty;
    processingEngine.Process(image, alignment);
    foreach (FormPage formPage in processingEngine.Pages)
    {
        foreach (FormField field in formPage)
        {
            // Only text results expose Text; skip null fields and other result types.
            if (field != null && field.Result is TextFormFieldResult textResult)
                resultsMessage = $"{resultsMessage}{field.Name} = {textResult.Text}\n";
        }
    }
    // Use the single-argument overload so braces in recognized text are printed literally.
    if (string.IsNullOrEmpty(resultsMessage))
        Console.WriteLine("No fields were processed");
    else
        Console.WriteLine(resultsMessage);
}
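Two details matter when printing field results: a field's result is not guaranteed to be a text result, and recognized text may contain braces, which the composite-format overloads of Console.WriteLine would try to interpret as placeholders. The following is a minimal sketch of both guards using plain .NET types. The tuple list is a hypothetical stand-in for the processed fields, with a string playing the role of a text result and any other object playing the role of a non-text result; BuildReport is an illustrative helper.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical field data standing in for the processed form fields.
var fields = new List<(string Name, object? Result)>
{
    ("Name", "Jane {Doe}"),      // braces would break a composite format string
    ("Signature", new byte[0]),  // non-text result
    ("Zip", null),               // field without a result
};

string report = BuildReport(fields);

// Written as-is: no format string is involved, so braces are safe.
Console.Write(report);

static string BuildReport(IEnumerable<(string Name, object? Result)> items)
{
    var builder = new StringBuilder();
    foreach (var (name, result) in items)
    {
        // Pattern matching skips null and non-text results instead of throwing.
        if (result is string text)
            builder.AppendLine($"{name} = {text}");
    }
    return builder.ToString();
}
```

A StringBuilder is used here because repeated string concatenation in a loop allocates a new string on every iteration; for a handful of fields either approach is fine.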
In the Main() method, add a call to formsOCREngine.Shutdown() below the RecognizeForm() call to properly shut down the OCR engine. The Main(string[] args) method should now look like this:
static void Main(string[] args)
{
    if (!InitLEAD())
        Console.WriteLine("Error setting license");
    else
        Console.WriteLine("License file set successfully");
    InitFormsEngines();
    CreateMasterFormAttributes();
    RecognizeForm();
    // Release the OCR engine resources once recognition and processing are finished.
    if (formsOCREngine != null && formsOCREngine.IsStarted)
        formsOCREngine.Shutdown();
}
Run the project by pressing F5, or by selecting Debug -> Start Debugging.
If the steps were followed correctly, the console appears and the application displays the recognized form along with the processed fields. For this example, a W9 form is used with 6 filled fields: Business Name, Address, City, State, Zip, and Name.
This tutorial showed how to create a set of attributes from master forms with the FormRecognitionAttributes class, recognize a form using the FormRecognitionEngine class, and process the form's fields and display the results in the console using the FormProcessingEngine class.