This tutorial shows how to create a master set of forms, then recognize and process a filled form using the LEADTOOLS Low-Level Form Interface.
Before any functionality from the SDK can be used, a valid runtime license must be set. For instructions on how to obtain a runtime license, refer to Obtaining a License.
Overview | |
---|---|
Summary | This tutorial covers how to recognize and process a form in a C# Console application. |
Completion Time | 30 minutes |
Visual Studio Project | Download tutorial project (4 KB) |
Platform | Windows C# Console Application |
IDE | Visual Studio 2017, 2019 |
Development License | Download LEADTOOLS |
Get familiar with the basic steps of creating a project by reviewing the Add References and Set a License tutorial, before working on the Manually Recognize and Process a Form - Console C# tutorial.
In Visual Studio, create a new C# Windows Console project, and add the following necessary LEADTOOLS references.
The references needed depend upon the purpose of the project. References can be added by one or the other of the following two methods (but not both). For this project, the following references are needed:
If using NuGet references, this tutorial requires the following NuGet package:
Leadtools.Document.Sdk
If local DLL references are used, the following DLLs are needed. The DLLs are located at <INSTALL_DIR>\LEADTOOLS21\Bin\Dotnet4\x64:
Leadtools.Codecs.dll
Leadtools.Codecs.Tif.dll
Leadtools.Codecs.Fax.dll
Leadtools.dll
Leadtools.Document.Writer.dll
Leadtools.Forms.Common.dll
Leadtools.Forms.Processing.dll
Leadtools.Forms.Recognition.dll
Leadtools.Forms.Recognition.Ocr.dll
Leadtools.Ocr.dll
Leadtools.Ocr.LEADEngine.dll
For a complete list of which Codec DLLs are required for specific formats, refer to File Format Support.
The License unlocks the features needed for the project. It must be set before any toolkit function is called. For details, including tutorials for different platforms, refer to Setting a Runtime License.
There are two types of runtime licenses:
Note: Adding LEADTOOLS NuGet and local references and setting a license are covered in more detail in the Add References and Set a License tutorial.
With the project created, the references added, and the license set, coding can begin.
In the Solution Explorer, open Program.cs. Add the following statements to the using block at the top.
// Using block at the top
using System;
using System.IO;
using System.Collections.Generic;
using Leadtools;
using Leadtools.Codecs;
using Leadtools.Forms.Common;
using Leadtools.Ocr;
using Leadtools.Forms.Recognition;
using Leadtools.Forms.Recognition.Ocr;
using Leadtools.Forms.Processing;
Add a new method called InitFormsEngines() and call it inside the Main method. Add the code below to initialize the FormRecognitionEngine, RasterCodecs, IOcrEngine, and FormProcessingEngine.
// Add these global members in the class
private static FormRecognitionEngine recognitionEngine;
private static RasterCodecs codecs;
private static IOcrEngine formsOCREngine;
private static FormProcessingEngine processingEngine;
static void Main(string[] args)
{
    SetLicense();
    InitFormsEngines();
}
static void InitFormsEngines()
{
    Console.WriteLine("Initializing Engines");
    codecs = new RasterCodecs();
    recognitionEngine = new FormRecognitionEngine();
    processingEngine = new FormProcessingEngine();
    // Start the LEAD OCR engine; adjust the runtime path to match the local installation
    formsOCREngine = OcrEngineManager.CreateEngine(OcrEngineType.LEAD);
    formsOCREngine.Startup(codecs, null, null, @"C:\LEADTOOLS21\Bin\Common\OcrLEADRuntime");
    // Register an OCR objects manager with the recognition engine
    OcrObjectsManager ocrObjectsManager = new OcrObjectsManager(formsOCREngine);
    ocrObjectsManager.Engine = formsOCREngine;
    recognitionEngine.ObjectsManagers.Add(ocrObjectsManager);
    Console.WriteLine("Engines initialized successfully");
}
In the Program class, add two new methods called CreateMasterFormAttributes() and RecognizeForm().
private static void CreateMasterFormAttributes()
{
    Console.WriteLine("Processing Master Form");
    string[] masterFileNames = Directory.GetFiles(@"C:\LEADTOOLS21\Resources\Images\Forms\MasterForm Sets\OCR", "*.tif", SearchOption.AllDirectories);
    foreach (string masterFileName in masterFileNames)
    {
        string formName = Path.GetFileNameWithoutExtension(masterFileName);
        // Load all pages of the master form image
        using (RasterImage image = codecs.Load(masterFileName, 0, CodecsLoadByteOrder.BgrOrGray, 1, -1))
        {
            FormRecognitionAttributes masterFormAttributes = recognitionEngine.CreateMasterForm(formName, Guid.Empty, null);
            // Image pages are 1-based, so page i + 1 is selected on each pass
            for (int i = 0; i < image.PageCount; i++)
            {
                image.Page = i + 1;
                recognitionEngine.AddMasterFormPage(masterFormAttributes, image, null);
            }
            recognitionEngine.CloseMasterForm(masterFormAttributes);
            // The relative path resolves against the current working directory,
            // which is the build output folder when the project is run from Visual Studio
            File.WriteAllBytes(formName + ".bin", masterFormAttributes.GetData());
        }
    }
    Console.WriteLine("Master Form Processing Complete");
    Console.WriteLine("=============================================================");
}
private static void RecognizeForm()
{
    Console.WriteLine("Recognizing Form\n");
    // Folder that contains the running executable, where the .bin master attributes are expected
    string outputDirectory = Path.GetDirectoryName(System.Reflection.Assembly.GetExecutingAssembly().Location);
    string formToRecognize = @"C:\LEADTOOLS21\Resources\Images\Forms\Forms to be Recognized\OCR\W9_OCR_Filled.tif";
    using (RasterImage image = codecs.Load(formToRecognize, 0, CodecsLoadByteOrder.BgrOrGray, 1, -1))
    {
        // Build recognition attributes for every page of the filled form
        FormRecognitionAttributes filledFormAttributes = recognitionEngine.CreateForm(null);
        for (int i = 0; i < image.PageCount; i++)
        {
            image.Page = i + 1;
            recognitionEngine.AddFormPage(filledFormAttributes, image, null);
        }
        recognitionEngine.CloseForm(filledFormAttributes);
        string resultMessage = "The form could not be recognized";
        string[] masterFileNames = Directory.GetFiles(outputDirectory, "*.bin");
        foreach (string masterFileName in masterFileNames)
        {
            // Load the field definitions (.xml) stored next to the master form image
            string fieldsFileName = Path.GetFileNameWithoutExtension(masterFileName) + ".xml";
            string fieldsFullPath = Path.Combine(@"C:\LEADTOOLS21\Resources\Images\Forms\MasterForm Sets\OCR", fieldsFileName);
            processingEngine.LoadFields(fieldsFullPath);
            FormRecognitionAttributes masterFormAttributes = new FormRecognitionAttributes();
            masterFormAttributes.SetData(File.ReadAllBytes(masterFileName));
            FormRecognitionResult recognitionResult = recognitionEngine.CompareForm(masterFormAttributes, filledFormAttributes, null);
            // Accept the first master form that matches with a confidence of at least 80
            if (recognitionResult.Confidence >= 80)
            {
                List<PageAlignment> alignment = new List<PageAlignment>();
                for (int k = 0; k < recognitionResult.PageResults.Count; k++)
                    alignment.Add(recognitionResult.PageResults[k].Alignment);
                resultMessage = $"This form has been recognized as a {Path.GetFileNameWithoutExtension(masterFileName)}";
                ProcessForm(image, alignment);
                break;
            }
        }
        // Write the message directly; passing it as a format string would throw if it contained braces
        Console.WriteLine(resultMessage);
        Console.WriteLine("=============================================================\n");
    }
}
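The loop in RecognizeForm() stops at the first master form whose confidence reaches 80. When several master forms resemble each other, one alternative is to compare against every master and keep the highest score. The following is a minimal sketch of that selection step only. It works on plain name and confidence pairs rather than LEADTOOLS types, and the names FormMatcher and BestMatch are hypothetical, not part of the SDK.

```csharp
using System;
using System.Collections.Generic;

static class FormMatcher
{
    // Returns the name with the highest confidence at or above the threshold,
    // or null when no candidate reaches the threshold.
    // Hypothetical helper for illustration; not a LEADTOOLS API.
    public static string BestMatch(IEnumerable<KeyValuePair<string, int>> scores, int threshold)
    {
        string bestName = null;
        int bestScore = 0;
        foreach (KeyValuePair<string, int> score in scores)
        {
            bool qualifies = score.Value >= threshold;
            bool isBetter = bestName == null || score.Value > bestScore;
            if (qualifies && isBetter)
            {
                bestName = score.Key;
                bestScore = score.Value;
            }
        }
        return bestName;
    }
}
```

In the tutorial's loop, this would mean collecting each master form name with its recognitionResult.Confidence, choosing the best candidate after the loop, and only then calling ProcessForm() with that candidate's alignment.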
Call both methods in the Main method after the InitFormsEngines() method.
// call these in Main() after InitFormsEngines();
CreateMasterFormAttributes();
RecognizeForm();
Shipped with the LEADTOOLS SDK are sample master form sets and sample filled forms for recognition and processing. This tutorial will utilize these samples. After installation the sample files can be found here: <INSTALL_DIR>\LEADTOOLS21\Resources\Images\Forms
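CreateMasterFormAttributes() writes each .bin file to a relative path, while RecognizeForm() searches the folder that contains the executable. The two locations coincide when the project is run from Visual Studio, but they can differ if the application is started from another working directory. A minimal sketch of one way to keep them aligned is to build the path from the application's base directory in both places. The class and member names below are hypothetical, and the sketch assumes the base directory is the same folder as the executable, which is the usual case for a console application.

```csharp
using System;
using System.IO;

static class MasterFormPaths
{
    // Folder the application was started from (normally the executable's folder).
    public static string OutputDirectory
    {
        get { return AppDomain.CurrentDomain.BaseDirectory; }
    }

    // Full path of the attributes file for a given master form name.
    // Hypothetical helper for illustration; not a LEADTOOLS API.
    public static string AttributesPath(string formName)
    {
        return Path.Combine(OutputDirectory, formName + ".bin");
    }
}
```

With such a helper, the write in CreateMasterFormAttributes() and the Directory.GetFiles() search in RecognizeForm() would both refer to the same folder regardless of the working directory.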
In the Program class, add a new method, ProcessForm(RasterImage image, List<PageAlignment> alignment). This method is called in RecognizeForm() from the previous step.
private static void ProcessForm(RasterImage image, List<PageAlignment> alignment)
{
    processingEngine.OcrEngine = formsOCREngine;
    string resultsMessage = string.Empty;
    // Extract the field values using the alignment computed during recognition
    processingEngine.Process(image, alignment);
    foreach (FormPage formPage in processingEngine.Pages)
        foreach (FormField field in formPage)
            // Report text results only; other result types are skipped instead of causing a null reference
            if (field != null && field.Result is TextFormFieldResult textResult)
                resultsMessage = $"{resultsMessage}{field.Name} = {textResult.Text}\n";
    // Write the text directly; recognized text may contain braces, which would break a format string
    if (string.IsNullOrEmpty(resultsMessage))
        Console.WriteLine("No fields were processed");
    else
        Console.WriteLine(resultsMessage);
}
In the Main method, add the code below after the call to RecognizeForm() to shut down the OCR engine and dispose of the IOcrEngine.
if (formsOCREngine != null && formsOCREngine.IsStarted)
    formsOCREngine.Shutdown();
formsOCREngine?.Dispose();
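The cleanup above runs only if every earlier call in Main returns normally. If loading an image or a fields file throws, the engine is never shut down. A common way to guard against that is a try/finally block. The sketch below illustrates the pattern with a hypothetical stand-in class, DemoEngine, that mimics only the IsStarted and Shutdown members used in this tutorial; it is not a LEADTOOLS type.

```csharp
using System;

// Hypothetical stand-in for an engine that must be shut down; not a LEADTOOLS type.
class DemoEngine
{
    public bool IsStarted { get; private set; }
    public void Startup() { IsStarted = true; }
    public void Shutdown() { IsStarted = false; }
}

static class CleanupDemo
{
    // Starts the engine, runs the work, and shuts the engine down afterwards,
    // even when the work throws an exception.
    public static void RunWithCleanup(DemoEngine engine, Action work)
    {
        try
        {
            engine.Startup();
            work();
        }
        finally
        {
            if (engine.IsStarted)
                engine.Shutdown();
        }
    }
}
```

Applied to this tutorial, the calls to InitFormsEngines(), CreateMasterFormAttributes(), and RecognizeForm() would go inside the try block, and the shutdown code would move into the finally block.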
Run the project by pressing F5, or by selecting Debug -> Start Debugging.
If the steps were followed correctly, the application runs and prints the name of the recognized form, followed by the processed field values, to the console.
For this example, a W9 form is used with 6 filled fields: Business Name, Address, City, State, Zip, and Name. The form was recognized correctly, and all of these fields were detected and labeled correctly.
This tutorial showed how to create a set of attributes from master forms with the FormRecognitionAttributes class and recognize a form using the FormRecognitionEngine class.