
Manually Recognize and Process a Form - Python

This tutorial shows how to create a set of master forms, then recognize and process a filled form in a Python application using the LEADTOOLS SDK.

Overview
Summary	This tutorial covers how to recognize and process a form using low-level Forms Recognition and Processing in a Python Console application.
Completion Time	30 minutes
Visual Studio Project	Download tutorial project (2 KB)
Platform	Python Console Application
IDE	Visual Studio 2022
Runtime Target	Python 3.10 or higher
Development License	Download LEADTOOLS
Try it in another language	C#: .NET 6+ (Console) Java: Java Python: Python

Required Knowledge

Before working on the Manually Recognize and Process a Form - Python tutorial, get familiar with the basic steps of creating a project by reviewing the Add References and Set a License tutorial.

Create the Project and Add LEADTOOLS References

Start with a copy of the project created in the Add References and Set a License for Python topic.

If you do not have that project, follow the steps in the relevant tutorial to create it.

The references needed depend upon the purpose of the project.

This tutorial requires the following .NET DLLs:

Leadtools.dll
Leadtools.Codecs.dll
Leadtools.Ocr.dll
Leadtools.Forms.Common.dll
Leadtools.Forms.Recognition.dll
Leadtools.Forms.Recognition.Ocr.dll
Leadtools.Forms.Processing.dll

For a complete list of which DLL files are required for your application, refer to Files to be Included With Your Application.

Set the License File

The License unlocks the features needed for the project. It must be set before any toolkit function is called. For details, including tutorials for different platforms, refer to Setting a Runtime License.

There are two types of runtime licenses:

Evaluation license, obtained at the time the evaluation toolkit is downloaded. It allows the toolkit to be evaluated.
Deployment license. If a Deployment license file and developer key are needed, refer to Obtaining a License.

Initialize FormRecognitionEngine, RasterCodecs, IOcrEngine, and FormProcessingEngine

With the project created, the references added, and the license set, coding can begin.

In the Solution Explorer, open Project-Name.py and place the following references below the "Add references to LEADTOOLS" comment:

# Add references to LEADTOOLS 
from leadtools import LibraryLoader 
LibraryLoader.add_reference("Leadtools") 
from Leadtools import * 
LibraryLoader.add_reference("Leadtools.Codecs") 
from Leadtools.Codecs import * 
LibraryLoader.add_reference("Leadtools.Ocr") 
from Leadtools.Ocr import * 
LibraryLoader.add_reference("Leadtools.Forms.Common") 
from Leadtools.Forms.Common import * 
LibraryLoader.add_reference("Leadtools.Forms.Recognition") 
from Leadtools.Forms.Recognition import * 
LibraryLoader.add_reference("Leadtools.Forms.Recognition.Ocr") 
from Leadtools.Forms.Recognition.Ocr import * 
LibraryLoader.add_reference("Leadtools.Forms.Processing") 
from Leadtools.Forms.Processing import * 
 
from System import * 
from System.IO import * 
from System.Collections.Generic import *

Add a new method to the Project-Name.py file named init_forms_engines(). Call the init_forms_engines() method inside the main() method below the set license call, as shown below.

def main(): 
 
    Support.set_license(os.path.join(DemosTools.get_root(), "C:/LEADTOOLS22/Support/Common/License")) 
     
    init_forms_engines() 
    create_master_form_attributes() 
    recognize_form()

Add the code below to the init_forms_engines() method to initialize the FormRecognitionEngine, FormProcessingEngine, RasterCodecs, and IOcrEngine objects.

def init_forms_engines(): 
 
    print("Initializing Engines") 
    global codecs 
    codecs = RasterCodecs() 
    global recognition_engine 
    recognition_engine = FormRecognitionEngine() 
    global processing_engine 
    processing_engine = FormProcessingEngine() 
    global forms_ocr_engine 
    forms_ocr_engine = OcrEngineManager.CreateEngine(OcrEngineType.LEAD) 
    forms_ocr_engine.Startup(codecs, None, None, r"C:\LEADTOOLS22\Bin\Common\OcrLEADRuntime") 
    ocr_objects_manager = OcrObjectsManager(forms_ocr_engine) 
    ocr_objects_manager.Engine = forms_ocr_engine 
    recognition_engine.ObjectsManagers.Add(ocr_objects_manager) 
 
    print("Engines initialized successfully")
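The engine startup above depends on a hard-coded OCR runtime directory. As a minimal sketch, assuming you want a clear error when that directory is missing, a small helper can check the path before it is handed to Startup(). The helper name require_directory is hypothetical and is not part of the LEADTOOLS SDK.

```python
import os


def require_directory(path, description):
    """Return path unchanged if it is an existing directory, otherwise raise.

    Hypothetical helper for illustration only; not a LEADTOOLS API.
    """
    if not os.path.isdir(path):
        raise FileNotFoundError(f"{description} not found: {path}")
    return path
```

One possible use is to wrap the runtime path passed to forms_ocr_engine.Startup(), for example require_directory(r"C:\LEADTOOLS22\Bin\Common\OcrLEADRuntime", "OCR runtime directory").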

Add the Create Master Form Attributes Code

In the Project-Name.py file, add two new methods named create_master_form_attributes() and recognize_form(). Both methods are called inside the main() method, below the init_forms_engines() call, as shown above. Ensure that create_master_form_attributes() is called before recognize_form(), because the attributes for the master forms must exist before any filled form can be recognized.

Add the code below to the create_master_form_attributes() method to create the master form attributes and save them for use by the FormRecognitionEngine object.

def create_master_form_attributes(): 
 
    print("Processing Master Form") 
    master_file_names = Directory.GetFiles(r"C:\LEADTOOLS22\Resources\Images\Forms\MasterForm Sets\OCR", "*.tif", SearchOption.AllDirectories) 
    for master_file_name in master_file_names: 
        form_name = Path.GetFileNameWithoutExtension(master_file_name) 
        image = codecs.Load(master_file_name, 0, CodecsLoadByteOrder.BgrOrGray, 1, -1) 
        master_form_attributes = recognition_engine.CreateMasterForm(form_name, Guid.Empty, None) 
        for i in range(image.PageCount): 
            image.Page = i + 1 
            recognition_engine.AddMasterFormPage(master_form_attributes, image, None) 
         
        recognition_engine.CloseMasterForm(master_form_attributes) 
        File.WriteAllBytes(form_name + ".bin", master_form_attributes.GetData()) 
    print("Master Form Processing Complete") 
    print("=============================================================")
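The code above searches all subdirectories for master forms but saves each attributes file as form_name + ".bin" in the current directory, so two master forms with the same file name in different subfolders would write to the same .bin file. The following is a minimal sketch of a pre-check for that case. It uses only the Python standard library, and the function name and the sample paths are hypothetical.

```python
from collections import Counter
from pathlib import PureWindowsPath


def duplicate_form_names(master_file_names):
    """Return the form names (file stems) that occur more than once.

    Names are compared case-insensitively because Windows file names are.
    Hypothetical helper for illustration only; not a LEADTOOLS API.
    """
    counts = Counter(PureWindowsPath(name).stem.lower() for name in master_file_names)
    return sorted(name for name, count in counts.items() if count > 1)


# Hypothetical paths, used only to demonstrate the check.
names = [
    r"C:\Forms\SetA\Invoice.tif",
    r"C:\Forms\SetB\invoice.tif",
    r"C:\Forms\W9.tif",
]
print(duplicate_form_names(names))  # ['invoice']
```

If the returned list is not empty, one option is to include the parent folder name in the .bin file name.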

Add the code below to the recognize_form() method to compare the filled form against each saved master form and recognize which one it matches.

def recognize_form(): 
    print("Recognizing Form\n") 
    get_project_directory = Directory.GetCurrentDirectory() 
    form_to_recognize = r"C:\LEADTOOLS22\Resources\Images\Forms\Forms to be Recognized\OCR\W9_OCR_Filled.tif" 
 
    image = codecs.Load(form_to_recognize, 0, CodecsLoadByteOrder.BgrOrGray, 1, -1) 
    filled_form_attributes = recognition_engine.CreateForm(None) 
 
    for i in range(image.PageCount): 
        image.Page = i + 1 
        recognition_engine.AddFormPage(filled_form_attributes, image, None) 
 
    recognition_engine.CloseForm(filled_form_attributes) 
 
    result_message = "The form could not be recognized" 
    master_file_names = Directory.GetFiles(get_project_directory, "*.bin") 
 
    for master_file_name in master_file_names: 
        fields_filename = Path.GetFileNameWithoutExtension(master_file_name) + ".xml" 
        fields_full_path = Path.Combine(r"C:\LEADTOOLS22\Resources\Images\Forms\MasterForm Sets\OCR", fields_filename) 
        processing_engine.LoadFields(fields_full_path) 
        master_form_attributes = FormRecognitionAttributes() 
        master_form_attributes.SetData(File.ReadAllBytes(master_file_name)) 
        recognition_result = recognition_engine.CompareForm(master_form_attributes, filled_form_attributes, None) 
 
        if (recognition_result.Confidence >= 80): 
            alignment = List[PageAlignment]() 
            for k in range(recognition_result.PageResults.Count): 
                alignment.Add(recognition_result.PageResults[k].Alignment) 
 
            result_message = f"This form has been recognized as a {Path.GetFileNameWithoutExtension(master_file_name)}" 
            process_form(image, alignment) 
            break 
    print(result_message) 
    print("=============================================================\n")
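The loop above stops at the first master form whose confidence is 80 or higher. When several master forms look alike, a possible variation is to compare against all of them and keep the highest score. The sketch below shows only that selection logic on plain numbers. The function name best_match and the sample values are hypothetical, and the confidences are assumed to have been collected from recognition_result.Confidence for each master form.

```python
def best_match(confidences, threshold=80):
    """Return (name, confidence) for the highest score at or above threshold.

    confidences maps a master form name to its confidence value.
    Returns None when no master form reaches the threshold.
    Hypothetical helper for illustration only; not a LEADTOOLS API.
    """
    candidates = [(name, value) for name, value in confidences.items() if value >= threshold]
    if not candidates:
        return None
    return max(candidates, key=lambda item: item[1])


# Hypothetical confidence values, used only to demonstrate the selection.
print(best_match({"FormA": 82, "FormB": 95, "FormC": 12}))  # ('FormB', 95)
print(best_match({"FormC": 12}))  # None
```

To use this in recognize_form(), the page alignments for each master form would also need to be kept until the best match is known, since process_form() requires them.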

Note

Shipped and installed with the LEADTOOLS SDK are sample master form sets and sample filled forms for recognition and processing. This tutorial uses these samples. The sample files are installed at <INSTALL_DIR>\LEADTOOLS22\Resources\Images\Forms.

Handling streams

Alternatively, you can load the file from a memory stream. To do this, add the following code to the recognize_form() method just above the codecs.Load() call:

buffer = File.ReadAllBytes(form_to_recognize) 
ms = MemoryStream(buffer)

Make sure to pass the memory stream to the codecs.Load() method:

image = codecs.Load(ms, 0, CodecsLoadByteOrder.BgrOrGray, 1, -1)

Add the Process Form Code

In the Project-Name.py file add a new method, process_form(image, alignment). This method is called inside the recognize_form() method shown in the previous step.

def process_form(image, alignment): 
 
    processing_engine.OcrEngine = forms_ocr_engine 
    results_message = "" 
 
    processing_engine.Process(image, alignment) 
    for form_page in processing_engine.Pages: 
        for field in form_page: 
            if field is not None: 
                results_message = f"{results_message}{field.Name} = {(field.Result).Text}\n" 
    if (results_message == ""): 
        print("No fields were processed") 
    else: 
        print(results_message)
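The result formatting in process_form() can also be written as a standalone function, which makes it possible to try out without the SDK. The sketch below is one way to do that, assuming each field exposes Name and Result.Text as in the code above. It also tolerates a result that is None or has no Text attribute; whether that can occur depends on the field types in the master form and is an assumption here. The stand-in objects built with SimpleNamespace are hypothetical and only imitate the shape of the real fields.

```python
from types import SimpleNamespace


def format_fields(pages):
    """Build one "Name = Text" line per non-empty field across all pages.

    Hypothetical helper for illustration only; not a LEADTOOLS API.
    """
    lines = []
    for page in pages:
        for field in page:
            if field is None:
                continue
            # getattr also covers a Result of None, which has no Text attribute.
            text = getattr(field.Result, "Text", None)
            lines.append(f"{field.Name} = {text if text is not None else ''}")
    return "\n".join(lines)


# Stand-in data that imitates processing_engine.Pages (hypothetical values).
sample_pages = [[
    SimpleNamespace(Name="Name", Result=SimpleNamespace(Text="John")),
    None,
    SimpleNamespace(Name="Zip", Result=None),
]]
print(format_fields(sample_pages))
```

Collecting the lines in a list and joining them once avoids rebuilding the whole string on every field.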

Shut Down the OCR Engine

In the main() method, call forms_ocr_engine.Shutdown() after the recognize_form() call to properly shut down the OCR engine. The main() method should now look like this:

def main(): 
 
    Support.set_license(os.path.join(DemosTools.get_root(), "C:/LEADTOOLS22/Support/Common/License")) 
     
    init_forms_engines() 
    create_master_form_attributes() 
    recognize_form() 
 
    if (forms_ocr_engine != None and forms_ocr_engine.IsStarted): 
        forms_ocr_engine.Shutdown()
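In the main() method above, the shutdown call is skipped if an earlier step raises an exception. A possible refinement, sketched below, is to wrap the work in try/finally so the engine is shut down either way. FakeOcrEngine is a hypothetical stand-in that only imitates the IsStarted property and Shutdown() method used in this tutorial, and run_with_shutdown is likewise a hypothetical helper name.

```python
class FakeOcrEngine:
    """Stand-in for the OCR engine; imitates only IsStarted and Shutdown()."""

    def __init__(self):
        self.IsStarted = True

    def Shutdown(self):
        self.IsStarted = False


def run_with_shutdown(engine, work):
    """Run work() and shut the engine down afterwards, even if work() raises.

    Hypothetical helper for illustration only; not a LEADTOOLS API.
    """
    try:
        return work()
    finally:
        if engine is not None and engine.IsStarted:
            engine.Shutdown()


engine = FakeOcrEngine()
run_with_shutdown(engine, lambda: None)
print(engine.IsStarted)  # False
```

In the tutorial project, the same try/finally structure could be placed directly in main() around the calls to create_master_form_attributes() and recognize_form().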

Run the Project

Run the project by pressing F5, or by selecting Debug -> Start Debugging.

If the steps were followed correctly, the console appears and the application displays the recognized form along with the processed fields. For this example, a W9 form is used with 6 filled fields: Business Name, Address, City, State, Zip, and Name.

Recognition and Processing results displayed to the console.

Wrap-up

This tutorial showed how to create a set of attributes from master forms with the FormRecognitionAttributes class, recognize a form using the FormRecognitionEngine class, and process the form's fields and display the results to the console using the FormProcessingEngine class.