This tutorial shows how to create a C# Windows Console application to perform basic operations using the LEADTOOLS Document Analyzer SDK.
| Overview | |
|---|---|
| Summary | This tutorial shows how to perform basic DocumentAnalyzer operations. |
| Completion Time | 30 minutes |
| Visual Studio Project | Download tutorial project (7 KB) |
| Platform | C# Windows Console Application |
| IDE | Visual Studio 2017, 2019 |
| Development License | Download LEADTOOLS |
Before working on the Document Analyzer High-Level Usage - Console C# tutorial, get familiar with the basic steps of creating a project by reviewing the Add References and Set a License tutorial.
Start with a copy of the project created in the Add References and Set a License tutorial. If you do not have that project, follow the steps in that tutorial to create it.
The references needed depend on the purpose of the project. Add them using one of the following two methods, but not both. For this project, the following references are needed:
If using NuGet references, this tutorial requires the following NuGet package:

- Leadtools.Document.Sdk

If using local DLL references, the following DLLs are needed.
The DLLs are located at <INSTALL_DIR>\LEADTOOLS22\Bin\Dotnet4\x64:

- Leadtools.dll
- Leadtools.Codecs.dll
- Leadtools.Document.dll
- Leadtools.Document.Analytics.dll
- Leadtools.Document.Unstructured.dll
- Leadtools.Ocr.dll
- Leadtools.Ocr.LEADEngine.dll

For a complete list of which DLL files are required for your application, refer to Files to be Included With Your Application.
The License unlocks the features needed for the project. It must be set before any toolkit function is called. For details, including tutorials for different platforms, refer to Setting a Runtime License.
There are two types of runtime licenses: an evaluation license and a deployment license.
Note: Adding LEADTOOLS NuGet and local references and setting a license are covered in more detail in the Add References and Set a License tutorial.
With the project created, the references added, and the license set, coding can begin.
In Solution Explorer, open Program.cs. Add the following statements to the using block at the top of Program.cs:

    // Using block at the top
    using <PROJECT_NAME>.Tutorials;
    using Leadtools;
    using System;
    using System.IO;

Right-click on <PROJECT_NAME>.csproj and select Add -> New Folder. Name the folder Tutorials. This folder will contain six classes showcasing various features of the high-level Document Analyzer API. To add a new class to the Tutorials folder, right-click the folder and select Add -> New Item. Select Class and name the class. Add the six classes listed in the table below.
| Class Name | Description |
|---|---|
| SaveLoad.cs | Create sample features, save to JSON, and load JSON. |
| StandardFeatures.cs | Create a standard date feature. |
| CustomFeatures.cs | Create a custom sample feature. |
| ExcludedFeatures.cs | Find emails with one exclusion. |
| LabeledFeatures.cs | Find emails and add a feature label. |
| ExecuteFeatures.cs | Create an engine to execute features. |
Add the code below to the Main() method to run the various features highlighted in the newly created classes.

    static void Main(string[] args)
    {
        SetLicense();

        // Run the tutorial samples
        SaveLoad.Run();
        StandardFeatures.Run();
        CustomFeatures.Run();
        ExcludedFeatures.Run();
        LabeledFeatures.Run();
        ExecuteFeatures.Run();
    }

In Solution Explorer, open SaveLoad.cs. Add the following statements to the using block at the top:

    using Leadtools.Document.Unstructured.Highlevel;
    using System.Collections.Generic;

Add a new Run() method to the SaveLoad class. Add the code below to the Run() method to execute the features in this class.

    public static void Run()
    {
        // Create sample features
        var feature = SampleFeature();

        // Save to JSON
        var json = feature.ToJson();

        // Load from JSON
        var loaded = FeatureResourceBuilder.Build(json);
    }

Add a new method named SampleFeature(), which returns the IFeature object used by the Run() method. IFeature is the base type for features created to extract form information using automated unstructured forms processing.

Add the code below to the SampleFeature() method to create a custom sample feature.

    private static IFeature SampleFeature()
    {
        // Create a sample custom feature that matches a single digit
        var sample = new CustomFeature()
        {
            Name = "Sample",
            Value = new List<InfoValue>()
            {
                new InfoValue()
                {
                    Tweaks = new RegexTweaks(),
                    TweaksForResults = new RegexResultsTweaks(),
                    Pattern = @"\d"
                }
            }
        };
        return sample;
    }

In Solution Explorer, open StandardFeatures.cs. Add the following statements to the using block at the top:

    using Leadtools.Document.Unstructured.Highlevel;
    using System.Collections.Generic;

Add a new Run() method to the StandardFeatures class. Add the code below to the Run() method to execute the features in this class.

    public static void Run()
    {
        // Date
        var std_feature = StandardDate();

        // All features
        var std_all = AllStandardFeatures();
    }

Add two new methods named StandardDate() and AllStandardFeatures(). Both of these methods are called inside the Run() method, to return the IFeature(s) for data extraction.
Add the code below to the StandardDate() method to create a standard date feature.

    private static IFeature StandardDate()
    {
        // Standard date feature
        var std_date = new StandardFeature() { ValueName = "Date", Name = "Tutorial_Date" };
        return std_date;
    }

Add the code below to the AllStandardFeatures() method to create a list of features from all the regular expressions in the built-in database.

    private static IEnumerable<IFeature> AllStandardFeatures()
    {
        // Create one standard feature per value name in the built-in database
        foreach (var value in RegexExpressionDb.List("value"))
        {
            var std = new StandardFeature() { ValueName = value, Name = value };
            yield return std;
        }
    }

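Because AllStandardFeatures() is written with yield return, C# evaluates it lazily: calling the method only creates an iterator, and no StandardFeature is constructed until the sequence is enumerated, for example with foreach or ToList(). The following is a minimal sketch of that behavior using only the .NET base library. The class `LazyDemo`, the method `Names()`, and the two-item array are hypothetical stand-ins for the tutorial's method and for RegexExpressionDb.List("value"); they are not part of the SDK.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LazyDemo
{
    // Counts how many items the iterator body has produced so far.
    public static int Produced = 0;

    // Hypothetical stand-in for AllStandardFeatures(): an iterator method using yield return.
    public static IEnumerable<string> Names()
    {
        foreach (var value in new[] { "Date", "Email" })
        {
            Produced++;
            yield return value;
        }
    }

    public static void Main()
    {
        var lazy = Names();              // Only creates the iterator; the body has not run
        Console.WriteLine(Produced);     // 0

        var list = lazy.ToList();        // Enumerating runs the iterator body
        Console.WriteLine(Produced);     // 2
        Console.WriteLine(list.Count);   // 2
    }
}
```

If the full list is needed more than once, materializing it once with ToList() avoids re-running the iterator on every enumeration.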
In Solution Explorer, open CustomFeatures.cs. Add the following statements to the using block at the top:

    using Leadtools.Document.Unstructured.Highlevel;
    using System.Collections.Generic;

Add a new Run() method to the CustomFeatures class. Add the code below to the Run() method to execute the features in this class.

    public static void Run()
    {
        // Custom feature to find (demo) banking account number
        var custom_feature = Account();
    }

Add a new method named Account(). Add the code below to the Account() method to return the feature created to find the bank account number.

    public static IFeature Account()
    {
        var acc = new CustomFeature() { Name = "Account" };

        // Labels expected to appear next to the account number
        acc.Label = new List<InfoLabel>()
        {
            new InfoLabel()
            {
                Value = new InfoValue()
                {
                    Pattern = "account(\\s)?number",
                    Tweaks = new RegexTweaks() { IgnoreCase = true, IgnoreWhiteSpace = false, FuzzyMatching = FuzzyMatching.Auto, IgnoreIfShorterThan = 8, LettersToNumbers = false, MatchWholeWord = false },
                    TweaksForResults = new RegexResultsTweaks() { IncludeWholeWord = false, IncludeWholeLine = false }
                },
                Where = ECLocation.Right,
                LocationProximity = 5
            },
            new InfoLabel()
            {
                Value = new InfoValue()
                {
                    Pattern = "loan(\\s)?number",
                    Tweaks = new RegexTweaks() { IgnoreCase = true, IgnoreWhiteSpace = false, FuzzyMatching = FuzzyMatching.Auto, IgnoreIfShorterThan = 8, LettersToNumbers = false, MatchWholeWord = false },
                    TweaksForResults = new RegexResultsTweaks() { IncludeWholeWord = false, IncludeWholeLine = false }
                },
                Where = ECLocation.Right,
                LocationProximity = 5
            },
            new InfoLabel()
            {
                Value = new InfoValue()
                {
                    Pattern = "brokerage(\\s)?cash(\\s)?number",
                    Tweaks = new RegexTweaks() { IgnoreCase = true, IgnoreWhiteSpace = false, FuzzyMatching = FuzzyMatching.Auto, IgnoreIfShorterThan = 10, LettersToNumbers = false, MatchWholeWord = false },
                    TweaksForResults = new RegexResultsTweaks() { IncludeWholeWord = false, IncludeWholeLine = false }
                },
                Where = ECLocation.Right,
                LocationProximity = 5
            },
            new InfoLabel()
            {
                Value = new InfoValue()
                {
                    Pattern = "Account(\\s)No.",
                    Tweaks = new RegexTweaks() { IgnoreCase = false, IgnoreWhiteSpace = false, FuzzyMatching = FuzzyMatching.Auto, IgnoreIfShorterThan = 8, LettersToNumbers = false, MatchWholeWord = false },
                    TweaksForResults = new RegexResultsTweaks() { IncludeWholeWord = false, IncludeWholeLine = false }
                },
                Where = ECLocation.Right,
                LocationProximity = 5
            },
        };

        // Patterns for the account number itself
        acc.Value = new List<InfoValue>()
        {
            new InfoValue()
            {
                Pattern = "\\d{3,4}(-)?\\d{3,14}",
                Tweaks = new RegexTweaks() { FuzzyMatching = FuzzyMatching.Auto, IgnoreCase = true, IgnoreWhiteSpace = true, IgnoreIfShorterThan = 5, LettersToNumbers = true, MatchWholeWord = false },
                TweaksForResults = new RegexResultsTweaks() { IncludeWholeWord = true, IncludeWholeLine = false }
            },
            new InfoValue()
            {
                // The separators are a hyphen or a whitespace character (\s)
                Pattern = "\\d{3,4}(-|\\s)?\\d{3,6}(-|\\s)?\\d{3,6}",
                Tweaks = new RegexTweaks() { FuzzyMatching = FuzzyMatching.Auto, IgnoreCase = true, IgnoreWhiteSpace = true, IgnoreIfShorterThan = 5, LettersToNumbers = true, MatchWholeWord = false },
                TweaksForResults = new RegexResultsTweaks() { IncludeWholeWord = true, IncludeWholeLine = false }
            }
        };

        return acc;
    }

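To get a feel for what the label and value patterns in Account() are meant to capture, they can be tried in isolation with the standard .NET Regex class. The following is a rough sketch under two assumptions: that the SDK interprets these patterns with .NET-compatible regular expression semantics, and that the separator group in the second value pattern is `(-|\s)`, that is, a hyphen or a whitespace character. The class `AccountPatternDemo`, the helper `FirstMatch`, and the sample strings are made up for illustration. SDK tweaks such as FuzzyMatching or IgnoreIfShorterThan are not modeled here.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AccountPatternDemo
{
    // Value patterns from Account(), written as verbatim strings.
    public const string ValuePattern1 = @"\d{3,4}(-)?\d{3,14}";
    public const string ValuePattern2 = @"\d{3,4}(-|\s)?\d{3,6}(-|\s)?\d{3,6}";

    // One of the label patterns from Account().
    public const string LabelPattern = @"account(\s)?number";

    // Returns the first match of pattern in text, or null when there is none.
    public static string FirstMatch(string pattern, string text, RegexOptions options = RegexOptions.None)
    {
        var match = Regex.Match(text, pattern, options);
        return match.Success ? match.Value : null;
    }

    public static void Main()
    {
        var line = "Account Number: 1234-567890"; // made-up sample text

        // The label pattern is matched case-insensitively, mirroring IgnoreCase = true.
        Console.WriteLine(FirstMatch(LabelPattern, line, RegexOptions.IgnoreCase)); // Account Number

        // The first value pattern accepts an optional hyphen between two digit groups.
        Console.WriteLine(FirstMatch(ValuePattern1, line));                         // 1234-567890

        // The second value pattern accepts three digit groups separated by hyphens or whitespace.
        Console.WriteLine(FirstMatch(ValuePattern2, "123 456 789"));                // 123 456 789
    }
}
```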
In Solution Explorer, open ExcludedFeatures.cs. Add the following statements to the using block at the top:

    using Leadtools.Document.Unstructured.Highlevel;
    using System.Collections.Generic;

Add a new Run() method to the ExcludedFeatures class. Add the code below to the Run() method to execute the features in this class.

    public static void Run()
    {
        // Feature to find emails excluding "[email protected]"
        var features = new List<IFeature>()
        {
            // Emails matching
            new StandardFeature() { ValueName = "Email" },
            // Excluding the exact email below
            ExcludeExact("[email protected]")
        };

        // The list now holds both features; if executed, it will match all emails except for [email protected]
    }

Add a new method named ExcludeExact() to the ExcludedFeatures class. This method is called in the Run() method above. Add the code below to the new method to create a feature that excludes any exact match of the text passed in from the Run() method.

    private static IFeature ExcludeExact(string text)
    {
        var ex = new CustomFeature() { Name = "Excluded" };
        ex.Value = new List<InfoValue>()
        {
            new InfoValue()
            {
                // Match the text literally rather than as a regular expression
                Pattern = text,
                PatternIsRegex = false,
                Tweaks = new RegexTweaks(),
                TweaksForResults = new RegexResultsTweaks()
            }
        };

        // Mark matches of this feature as exclusions
        ex.Excluded = true;
        return ex;
    }

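Setting PatternIsRegex to false matters for text such as an email address, because the `.` character would otherwise act as a regex wildcard. The sketch below illustrates the difference with the standard .NET Regex class, using Regex.Escape to approximate literal matching. It assumes the SDK's literal mode behaves like an escaped pattern, which the source does not confirm. The class `LiteralMatchDemo`, its helpers, and both addresses are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LiteralMatchDemo
{
    // Treats text as a regular expression.
    public static bool MatchesAsRegex(string text, string input)
    {
        return Regex.IsMatch(input, text);
    }

    // Treats text as a literal string by escaping regex metacharacters first.
    public static bool MatchesAsLiteral(string text, string input)
    {
        return Regex.IsMatch(input, Regex.Escape(text));
    }

    public static void Main()
    {
        var excluded = "jane.doe@example.com";   // made-up address to exclude
        var lookalike = "janeXdoe@exampleYcom";  // differs only where the dots are

        Console.WriteLine(MatchesAsRegex(excluded, lookalike));   // True: '.' matches any character
        Console.WriteLine(MatchesAsLiteral(excluded, lookalike)); // False: dots must match literally
        Console.WriteLine(MatchesAsLiteral(excluded, excluded));  // True
    }
}
```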
In Solution Explorer, open LabeledFeatures.cs. Add the following statements to the using block at the top:

    using Leadtools.Document.Unstructured.Highlevel;
    using System.Collections.Generic;

Add a new Run() method to the LabeledFeatures class. Add the code below to the Run() method to execute the features in this class.

    public static void Run()
    {
        // Feature for Emails matching
        var feature = new StandardFeature() { ValueName = "Email" };

        // Add label
        AddLabel(feature, "email:");
    }

Add a new method to the LabeledFeatures class named AddLabel(StandardFeature feature, string labelText). Add the code below to the new method to create a custom label for a custom or standard feature.

    private static void AddLabel(StandardFeature feature, string labelText)
    {
        feature.CustomLabel = true;
        feature.CustomLabels = new List<InfoLabel>()
        {
            new InfoLabel()
            {
                Value = new InfoValue()
                {
                    Tweaks = new RegexTweaks(),
                    TweaksForResults = new RegexResultsTweaks(),
                    // Exact matching label text
                    Pattern = labelText,
                    PatternIsRegex = false,
                },
                // Location
                Where = ECLocation.Right,
                // Proximity
                LocationProximity = 5,
            },
        };
    }

In Solution Explorer, open ExecuteFeatures.cs. Add the following statements to the using block at the top:

    using System.Collections.Generic;
    using System.Threading;
    using Leadtools.Document;
    using Leadtools.Document.Unstructured.Highlevel;

Add a new Run() method to the ExecuteFeatures class. This class shows how to run the FeaturesProcessingEngine to extract data from a loaded document based on the created features. Add the code below to the Run() method to execute the features in this class.

    public static void Run()
    {
        // Custom feature to find (demo) banking account number
        var custom_feature = Account();

        // Load a target document
        var doc_file_name = @"INSERT FILE PATH TO TARGET DOCUMENT";
        var Document = DocumentFactory.LoadFromFile(doc_file_name, new LoadDocumentOptions());

        // Create engine to run and execute features
        var engine = new FeaturesProcessingEngine(true);

        // Wait for the asynchronous engine call to finish so that Main() does not return before the results are ready
        var results = engine.Run(new List<IFeature>() { custom_feature }, Document, CancellationToken.None).GetAwaiter().GetResult();
    }

The Account() method called in the Run() method is the same Account() method defined in the CustomFeatures class, so copy that code into the ExecuteFeatures class.
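A console application exits as soon as Main() returns, so work started by an asynchronous call such as engine.Run(...) has to be waited on before that point. The following is a minimal sketch of one way to structure this: the asynchronous method returns a Task, and Main() blocks on it. It uses only the .NET base library. `AsyncRunDemo`, `ProcessAsync`, and `RunAsync` are hypothetical names, and Task.Delay stands in for the engine call.

```csharp
using System;
using System.Threading.Tasks;

public static class AsyncRunDemo
{
    // Hypothetical stand-in for an asynchronous engine call.
    public static async Task<int> ProcessAsync()
    {
        await Task.Delay(50);
        return 3; // pretend three results were found
    }

    // Returning Task<int> (rather than void) lets the caller wait for completion.
    public static async Task<int> RunAsync()
    {
        var count = await ProcessAsync();
        return count;
    }

    public static void Main()
    {
        // Block until the asynchronous work finishes before Main returns.
        var count = RunAsync().GetAwaiter().GetResult();
        Console.WriteLine(count); // 3
    }
}
```

With C# 7.1 or later, Main() can instead be declared as `static async Task Main()` and use await directly.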
Run the project by pressing F5, or by selecting Debug -> Start Debugging.
If the steps were followed correctly, the console appears and the application executes the code for each sample feature class. To test the ExecuteFeatures class code, set the doc_file_name string to the file path of your test document.
This tutorial showed how to use the LEADTOOLS Document Analyzer to perform high-level API operations.