This tutorial shows how to create a set of master forms, then recognize and process a filled form in a Python application using the LEADTOOLS SDK.
Overview | |
---|---|
Summary | This tutorial covers how to recognize and process a form using low-level Forms Recognition and Processing in a Python Console application. |
Completion Time | 30 minutes |
Visual Studio Project | Download tutorial project (2 KB) |
Platform | Python Console Application |
IDE | Visual Studio 2022 |
Runtime Target | Python 3.10 or higher |
Development License | Download LEADTOOLS |
Get familiar with the basic steps of creating a project by reviewing the Add References and Set a License tutorial, before working on the Manually Recognize and Process a Form - Python tutorial.
Start with a copy of the project created in the Add References and Set a License for Python topic.
If you do not have that project, follow the steps in the relevant tutorial to create it.
The references needed depend upon the purpose of the project.
This tutorial requires the following .NET DLLs:
Leadtools.dll
Leadtools.Codecs.dll
Leadtools.Ocr.dll
Leadtools.Forms.Common.dll
Leadtools.Forms.Recognition.dll
Leadtools.Forms.Recognition.Ocr.dll
Leadtools.Forms.Processing.dll
For a complete list of which DLL files are required for your application, refer to Files to be Included With Your Application.
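A missing assembly usually shows up only when the matching add_reference call fails, so it can help to check for the files first. The following is a minimal sketch and not part of the LEADTOOLS SDK: the helper name missing_assemblies and its bin_dir argument are hypothetical, and the folder that holds the DLLs depends on your installation.

```python
from pathlib import Path

# DLL names copied from the list above.
REQUIRED_DLLS = [
    "Leadtools.dll",
    "Leadtools.Codecs.dll",
    "Leadtools.Ocr.dll",
    "Leadtools.Forms.Common.dll",
    "Leadtools.Forms.Recognition.dll",
    "Leadtools.Forms.Recognition.Ocr.dll",
    "Leadtools.Forms.Processing.dll",
]


def missing_assemblies(bin_dir, names=REQUIRED_DLLS):
    """Return the names from `names` that are not files inside `bin_dir`.

    Hypothetical helper: `bin_dir` is whichever folder holds your
    LEADTOOLS .NET binaries.
    """
    folder = Path(bin_dir)
    return [name for name in names if not (folder / name).is_file()]
```

Calling missing_assemblies() with the folder that contains your binaries returns an empty list when every required DLL is present.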
The License unlocks the features needed for the project. It must be set before any toolkit function is called. For details, including tutorials for different platforms, refer to Setting a Runtime License.
There are two types of runtime licenses:
With the project created, the references added, and the license set, coding can begin.
In the Solution Explorer, open Project-Name.py and place the following references below the "Add references to LEADTOOLS" comment:
# Add references to LEADTOOLS
from leadtools import LibraryLoader
LibraryLoader.add_reference("Leadtools")
from Leadtools import *
LibraryLoader.add_reference("Leadtools.Codecs")
from Leadtools.Codecs import *
LibraryLoader.add_reference("Leadtools.Ocr")
from Leadtools.Ocr import *
LibraryLoader.add_reference("Leadtools.Forms.Common")
from Leadtools.Forms.Common import *
LibraryLoader.add_reference("Leadtools.Forms.Recognition")
from Leadtools.Forms.Recognition import *
LibraryLoader.add_reference("Leadtools.Forms.Recognition.Ocr")
from Leadtools.Forms.Recognition.Ocr import *
LibraryLoader.add_reference("Leadtools.Forms.Processing")
from Leadtools.Forms.Processing import *
from System import *
from System.IO import *
from System.Collections.Generic import *
Add a new method named init_forms_engines() to the Project-Name.py file. Call init_forms_engines() inside the main() method, below the set license call, as shown below.
def main():
    Support.set_license("C:/LEADTOOLS23/Support/Common/License")
    init_forms_engines()
    create_master_form_attributes()
    recognize_form()
Add the code below to the init_forms_engines() method to initialize the FormRecognitionEngine, FormProcessingEngine, RasterCodecs, and IOcrEngine objects.
def init_forms_engines():
    print("Initializing Engines")
    global codecs
    codecs = RasterCodecs()
    global recognition_engine
    recognition_engine = FormRecognitionEngine()
    global processing_engine
    processing_engine = FormProcessingEngine()
    global forms_ocr_engine
    forms_ocr_engine = OcrEngineManager.CreateEngine(OcrEngineType.LEAD)
    forms_ocr_engine.Startup(codecs, None, None, r"C:\LEADTOOLS23\Bin\Common\OcrLEADRuntime")
    ocr_objects_manager = OcrObjectsManager(forms_ocr_engine)
    ocr_objects_manager.Engine = forms_ocr_engine
    recognition_engine.ObjectsManagers.Add(ocr_objects_manager)
    print("Engines initialized successfully")
In the Project-Name.py file, add two new methods named create_master_form_attributes() and recognize_form(). Both are called inside the main() method, below the init_forms_engines() call, as shown above. Ensure that create_master_form_attributes() is called before recognize_form(), because the attributes for the master forms must be created before any filled forms can be recognized.
Add the code below to the create_master_form_attributes() method to create the master form attributes and add the master forms to the FormRecognitionEngine object.
def create_master_form_attributes():
    print("Processing Master Form")
    master_file_names = Directory.GetFiles(r"C:\LEADTOOLS23\Resources\Images\Forms\MasterForm Sets\OCR", "*.tif", SearchOption.AllDirectories)
    for master_file_name in master_file_names:
        form_name = Path.GetFileNameWithoutExtension(master_file_name)
        image = codecs.Load(master_file_name, 0, CodecsLoadByteOrder.BgrOrGray, 1, -1)
        master_form_attributes = recognition_engine.CreateMasterForm(form_name, Guid.Empty, None)
        # Page numbers are 1-based, so select each page before adding it
        for i in range(image.PageCount):
            image.Page = i + 1
            recognition_engine.AddMasterFormPage(master_form_attributes, image, None)
        recognition_engine.CloseMasterForm(master_form_attributes)
        # Save the attributes as <form_name>.bin in the current working directory
        File.WriteAllBytes(form_name + ".bin", master_form_attributes.GetData())
    print("Master Form Processing Complete")
    print("=============================================================")
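The tutorial's code ties three files together by name: the master image, the .bin attributes file written to the current working directory, and an .xml fields file that recognize_form() looks for in the master form folder. The sketch below spells out that convention in plain Python. The helper name derived_names and the example path are hypothetical, and PureWindowsPath is used only so the path handling behaves the same on any platform.

```python
from pathlib import PureWindowsPath


def derived_names(master_image_path, fields_dir):
    """Map a master form image to the file names the tutorial derives from it.

    Hypothetical helper for illustration; it touches no files.
    """
    # Same idea as Path.GetFileNameWithoutExtension() in the tutorial code
    form_name = PureWindowsPath(master_image_path).stem
    return {
        "form_name": form_name,
        # Written relative to the current working directory
        "attributes_file": form_name + ".bin",
        # Expected next to the master images
        "fields_file": str(PureWindowsPath(fields_dir) / (form_name + ".xml")),
    }


# Example with a made-up master image path
names = derived_names(r"C:\Forms\Masters\Invoice.tif", r"C:\Forms\Masters")
```

Because the .bin files are written relative to the working directory, running the script from a different folder produces a fresh set of attribute files there.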
Add the code below to the recognize_form() method to recognize the filled form by comparing it against each of the master forms.
def recognize_form():
    print("Recognizing Form\n")
    get_project_directory = Directory.GetCurrentDirectory()
    form_to_recognize = r"C:\LEADTOOLS23\Resources\Images\Forms\Forms to be Recognized\OCR\W9_OCR_Filled.tif"
    image = codecs.Load(form_to_recognize, 0, CodecsLoadByteOrder.BgrOrGray, 1, -1)
    # Build the attributes of the filled form, one page at a time
    filled_form_attributes = recognition_engine.CreateForm(None)
    for i in range(image.PageCount):
        image.Page = i + 1
        recognition_engine.AddFormPage(filled_form_attributes, image, None)
    recognition_engine.CloseForm(filled_form_attributes)
    result_message = "The form could not be recognized"
    # Compare the filled form against every saved master form
    master_file_names = Directory.GetFiles(get_project_directory, "*.bin")
    for master_file_name in master_file_names:
        fields_filename = Path.GetFileNameWithoutExtension(master_file_name) + ".xml"
        fields_full_path = Path.Combine(r"C:\LEADTOOLS23\Resources\Images\Forms\MasterForm Sets\OCR", fields_filename)
        processing_engine.LoadFields(fields_full_path)
        master_form_attributes = FormRecognitionAttributes()
        master_form_attributes.SetData(File.ReadAllBytes(master_file_name))
        recognition_result = recognition_engine.CompareForm(master_form_attributes, filled_form_attributes, None)
        # Accept the first master form that matches with a confidence of 80 or more
        if recognition_result.Confidence >= 80:
            alignment = List[PageAlignment]()
            for k in range(recognition_result.PageResults.Count):
                alignment.Add(recognition_result.PageResults[k].Alignment)
            result_message = f"This form has been recognized as a {Path.GetFileNameWithoutExtension(master_file_name)}"
            process_form(image, alignment)
            break
    print(result_message)
    print("=============================================================\n")
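The loop above stops at the first master form whose confidence reaches 80. If several master forms in your own set could pass that threshold for the same filled form, picking the highest score may be the safer choice. The sketch below contrasts the two strategies in plain Python. The form names and confidence values are made up, and the functions are illustrative stand-ins rather than LEADTOOLS calls.

```python
def first_match(confidences, threshold=80):
    """Return the first name whose confidence reaches the threshold, else None.

    Mirrors the early-exit loop in recognize_form().
    """
    for name, confidence in confidences.items():
        if confidence >= threshold:
            return name
    return None


def best_match(confidences, threshold=80):
    """Return the highest-confidence name at or above the threshold, else None."""
    name = max(confidences, key=confidences.get, default=None)
    if name is not None and confidences[name] >= threshold:
        return name
    return None


# Hypothetical confidence values, in the order the master forms were compared
scores = {"FormA": 82, "FormB": 95, "FormC": 40}
first = first_match(scores)
best = best_match(scores)
```

With these values the two strategies disagree: first_match() settles for FormA, while best_match() picks FormB. The trade-off is that the best-match approach has to compare against every master form before processing any fields.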
Note: The LEADTOOLS SDK ships with sample master form sets and sample filled forms for recognition and processing, and this tutorial uses them. The sample files are installed at <INSTALL_DIR>\LEADTOOLS23\Resources\Images\Forms.
Alternatively, you can load the file from a memory stream. To do this, add the following code to the recognize_form() method just above the codecs.Load() call:
buffer = File.ReadAllBytes(form_to_recognize)
ms = MemoryStream(buffer)
Then change the codecs.Load() call to read from the stream:
image = codecs.Load(ms, 0, CodecsLoadByteOrder.BgrOrGray, 1, -1)
In the Project-Name.py file, add a new method, process_form(image, alignment). This method is called inside the recognize_form() method shown in the previous step.
def process_form(image, alignment):
    processing_engine.OcrEngine = forms_ocr_engine
    results_message = ""
    processing_engine.Process(image, alignment)
    for form_page in processing_engine.Pages:
        for field in form_page:
            if field is not None:
                results_message = f"{results_message}{field.Name} = {field.Result.Text}\n"
    if results_message == "":
        print("No fields were processed")
    else:
        print(results_message)
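The method above builds its output by appending to a string inside the loop. An equivalent pattern collects the lines in a list and joins them once, which also makes the formatting easy to test without the SDK. This is a minimal sketch: Field and Result are hypothetical stand-ins that only mimic the Name and Result.Text attributes used above, and the sample values are made up.

```python
from collections import namedtuple

# Stand-ins for the SDK's field objects, for illustration only
Result = namedtuple("Result", "Text")
Field = namedtuple("Field", "Name Result")


def format_fields(pages):
    """Return one 'Name = Text' line per non-empty field, joined by newlines."""
    lines = []
    for page in pages:
        for field in page:
            if field is not None:
                lines.append(f"{field.Name} = {field.Result.Text}")
    return "\n".join(lines)


# Two hypothetical pages; the None entry is skipped
sample_pages = [
    [Field("Name", Result("Jane Doe")), None],
    [Field("Zip", Result("12345"))],
]
report = format_fields(sample_pages)
```

Unlike the string built in process_form(), the joined result has no trailing newline, and an empty list of pages still yields an empty string for the "No fields were processed" check.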
In the main() method, add a forms_ocr_engine.Shutdown() call under the recognize_form() call to properly shut down the OCR engine. The main() method should now look like:
def main():
    Support.set_license("C:/LEADTOOLS23/Support/Common/License")
    init_forms_engines()
    create_master_form_attributes()
    recognize_form()
    if forms_ocr_engine is not None and forms_ocr_engine.IsStarted:
        forms_ocr_engine.Shutdown()
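As written, the shutdown call is skipped if one of the earlier steps raises an exception. One common way to guard against that is a try/finally block. The sketch below shows the pattern with a stand-in engine so it runs without the SDK. FakeEngine and run_with_cleanup are hypothetical names, and FakeEngine only mimics the IsStarted and Shutdown() members used above.

```python
class FakeEngine:
    """Stand-in for an OCR engine; not a LEADTOOLS class."""

    def __init__(self):
        self.IsStarted = True

    def Shutdown(self):
        self.IsStarted = False


def run_with_cleanup(engine, work):
    """Run work() and shut the engine down afterwards, even if work() raises."""
    try:
        return work()
    finally:
        if engine is not None and engine.IsStarted:
            engine.Shutdown()


# The engine is shut down after the work completes
engine = FakeEngine()
outcome = run_with_cleanup(engine, lambda: "done")
```

In the tutorial's main() method, the same idea would mean placing the create_master_form_attributes() and recognize_form() calls in the try block and the shutdown check in the finally block.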
Run the project by pressing F5, or by selecting Debug -> Start Debugging.
If the steps were followed correctly, the console appears and the application displays the recognized form along with the processed fields. For this example, a W9 form is used with 6 filled fields: Business Name, Address, City, State, Zip, and Name.
This tutorial showed how to create a set of attributes from master forms with the FormRecognitionAttributes class, recognize a form using the FormRecognitionEngine class, and process the form's fields and display the results in the console using the FormProcessingEngine class.