This tutorial shows how to highlight words in a loaded document using the DocumentAnalyzer class, according to a JSON ruleset, in a WinForms C# application.
Overview | |
---|---|
Summary | This tutorial covers how to use the LEADTOOLS Document Analyzer in a WinForms C# application. |
Completion Time | 30 minutes |
Visual Studio Project | Download tutorial project (4 KB) |
Platform | WinForms C# Application |
IDE | Visual Studio 2022 |
Development License | Download LEADTOOLS |
Get familiar with the basic steps of creating a project by reviewing the Add References and Set a License tutorial and the Display Files in the Document Viewer tutorial before working on the Highlight Words with the Document Analyzer - WinForms C# tutorial.
Start with a copy of the project created in the Display Files in the Document Viewer tutorial. If you don't have that project, follow the steps in that tutorial to create it.
The references needed depend upon the purpose of the project. References can be added by one or the other of the following two methods (but not both). For this project, the following references are needed:
If NuGet references are used, this tutorial requires the following NuGet packages:
Leadtools.Annotations.Winforms
Leadtools.Document.Sdk
Leadtools.Document.Viewer.WinForms
Newtonsoft.Json
If local DLL references are used, the following DLLs are needed. The DLLs are located at <INSTALL_DIR>\LEADTOOLS23\Bin\net:
Leadtools.Annotations.Automation.dll
Leadtools.Annotations.Engine.dll
Leadtools.Annotations.WinForms.dll
Leadtools.Caching.dll
Leadtools.Controls.WinForms.dll
Leadtools.Core.dll
Leadtools.dll
Leadtools.Document.Analytics.dll
Leadtools.Document.Pdf.dll
Leadtools.Document.Unstructured.dll
Leadtools.Document.Viewer.WinForms.dll
Leadtools.Document.dll
Leadtools.Ocr.LEADEngine.dll
Leadtools.Ocr.dll
For a complete list of which DLLs are required for specific features, refer to Files to be Included in your Application.
The License unlocks the features needed for the project. It must be set before any toolkit function is called. For details, including tutorials for different platforms, refer to Setting a Runtime License.
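There are two types of runtime licenses: an evaluation license, obtained when the evaluation toolkit is downloaded, and a deployment license.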
Note: Adding LEADTOOLS NuGet and local references and setting a license are covered in more detail in the Add References and Set a License tutorial.
With the project created, references added, license set, and code from the Display Files in the Document Viewer tutorial added, coding can begin.
In the Solution Explorer, open Form1.cs. Right-click on the Design Window and select View Code, or press F7, to bring up the code behind the Form. Add the following statements to the using block at the top:
using Leadtools;
using Leadtools.Document;
using Leadtools.Caching;
using Leadtools.Document.Viewer;
using Leadtools.Controls;
using Leadtools.Annotations.Engine;
using Leadtools.Annotations.Automation;
using Leadtools.Annotations.WinForms;
using Leadtools.Document.Data;
using Leadtools.Document.Analytics;
using Leadtools.Document.Unstructured;
In the InitDocumentViewer() method, change the createOptions.UseAnnotations value to true.
var createOptions = new DocumentViewerCreateOptions();
// Set the UI part where the Document Viewer is displayed
createOptions.ViewContainer = this.Controls.Find("docViewerPanel", false)[0];
// Set the UI part where the Thumbnails are displayed
createOptions.ThumbnailsContainer = this.Controls.Find("thumbPanel", false)[0];
// Enable using annotations
createOptions.UseAnnotations = true;
// Now create the viewer
// Field name assumed to be _documentViewer, matching the other snippets in this tutorial
_documentViewer = DocumentViewerFactory.CreateDocumentViewer(createOptions);
Add the following lines of code to initialize the Automation Manager and Automation Manager Helper:
var automationManager = _documentViewer.Annotations.AutomationManager;
var automationManagerHelper = new AutomationManagerHelper(automationManager);
Use the code below in the InitUI() method to add an Analyze button.
var analyzeButton = new Button();
analyzeButton.Name = "analyzeButton";
analyzeButton.Text = "&Analyze";
analyzeButton.Location = new System.Drawing.Point(loadButton.Location.X + loadButton.Width, loadButton.Location.Y);
analyzeButton.Click += (sender, e) => AnalyzeDocument(analyzeButton);
topPanel.Controls.Add(analyzeButton);
Add the two lines below inside the LoadDocument() method, under the OpenFileDialog declaration. For this demo, load PDF documents from the C:\LEADTOOLS23\Resources\Images\Forms\Unstructured directory.
ofd.InitialDirectory = @"C:\LEADTOOLS23\Resources\Images\Forms\Unstructured";
ofd.Filter = "PDF Files|*.pdf";
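The Filter string set on the dialog pairs a description with a file pattern, separated by a vertical bar, and further pairs are appended with more vertical bars. The sketch below builds such a string from pairs. The DialogFilters class and its Build method are hypothetical helpers for illustration and are not part of LEADTOOLS or WinForms.

```csharp
using System;
using System.Linq;

// Hypothetical helper: not part of LEADTOOLS or WinForms.
public static class DialogFilters
{
    // Builds a FileDialog.Filter string from (description, pattern) pairs.
    // Each pair contributes "description|pattern"; pairs are joined with '|'.
    public static string Build(params (string Description, string Pattern)[] entries)
    {
        if (entries == null || entries.Length == 0)
            throw new ArgumentException("At least one filter entry is required.", nameof(entries));

        return string.Join("|", entries.Select(e => $"{e.Description}|{e.Pattern}"));
    }
}
```

With this helper, ofd.Filter = DialogFilters.Build(("PDF Files", "*.pdf")); produces the same "PDF Files|*.pdf" string as the literal assignment.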
Add the AnalyzeDocument() method below to let the user load a JSON rule-set and use it with the DocumentAnalyzer to recognize the related words and highlight them.
private void AnalyzeDocument(Button analyzeButton)
{
    string ruleset = null;
    // Field name assumed to be _virtualDocument, matching the Run() call below
    if (_virtualDocument.Pages.Count > 0)
    {
        // Load JSON Rule-Set
        OpenFileDialog openRuleset = new OpenFileDialog();
        openRuleset.InitialDirectory = @"C:\LEADTOOLS23\Resources\Images\Forms\Unstructured";
        openRuleset.Filter = "Ruleset JSON file (*.json)|*.json";
        if (openRuleset.ShowDialog() == DialogResult.OK)
        {
            ruleset = openRuleset.FileName;
        }
        if (ruleset != null)
        {
            try
            {
                // Create Analyzer
                DocumentAnalyzer analyzer = new DocumentAnalyzer()
                {
                    Reader = new UnstructuredDataReader(),
                    QueryContext = new FileRepositoryContext(ruleset)
                };
                // Add Action to Highlight Results
                ActionElementSet actions = new ActionElementSet();
                actions.ActionElements.Add(new MyHighlightAction(_documentViewer));
                DocumentAnalyzerRunOptions options = new DocumentAnalyzerRunOptions()
                {
                    ElementQuery = new RepositoryQuery(),
                    Actions = actions
                };
                _documentViewer.BeginUpdate();
                try
                {
                    // Run Analyzer; MyHighlightAction draws the highlights
                    analyzer.Run(_virtualDocument, options);
                }
                finally
                {
                    // Always resume viewer updates, even if Run throws
                    _documentViewer.EndUpdate();
                }
            }
            catch (LeadtoolsException ex)
            {
                MessageBox.Show(ex.Message);
            }
        }
    }
    else
    {
        MessageBox.Show("Load a Document First");
    }
}
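AnalyzeDocument() catches only LeadtoolsException, and the tutorial does not say how the analyzer reacts to a rule-set file that is not valid JSON. One possible guard is to check the file's syntax before creating the analyzer. The sketch below assumes System.Text.Json is available (it ships with .NET Core 3.0 and later, and is a NuGet package on .NET Framework). RulesetCheck is a hypothetical helper name, and the check covers JSON syntax only, not the rule-set schema.

```csharp
using System.IO;
using System.Text.Json;

// Hypothetical helper: checks JSON syntax only, not the LEADTOOLS rule-set schema.
public static class RulesetCheck
{
    // Returns true when the text parses as a JSON document.
    public static bool IsWellFormedJson(string text)
    {
        if (string.IsNullOrWhiteSpace(text))
            return false;

        try
        {
            // JsonDocument is disposable; it is parsed here only to validate syntax.
            using (JsonDocument.Parse(text))
            {
                return true;
            }
        }
        catch (JsonException)
        {
            return false;
        }
    }

    // Returns true when the file exists and its content parses as JSON.
    public static bool IsWellFormedJsonFile(string path)
    {
        return File.Exists(path) && IsWellFormedJson(File.ReadAllText(path));
    }
}
```

In AnalyzeDocument(), a call such as RulesetCheck.IsWellFormedJsonFile(ruleset) could precede the try block, showing a message and returning when it yields false.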
Use the code below to add the implementation for MyHighlightAction, which processes the results obtained from the DocumentAnalyzer and creates a highlight annotation object for each recognized item in the DocumentViewer.
public class MyHighlightAction : HighlightAction
{
    private DocumentViewer docViewer;

    public MyHighlightAction(DocumentViewer documentViewer)
    {
        Id = "HIGHLIGHT_DOCUMENT";
        docViewer = documentViewer;
    }

    public override void Run(LEADDocument document, IList<ElementSetResult> results)
    {
        // Report the number of items in the first result set
        MessageBox.Show($"Document Analyzer Done: {(results.Count > 0 ? results[0].Items.Count.ToString() : "No")} matches found.");
        // Add highlight annotations
        process(document, results);
    }

    private void process(LEADDocument document, IList<ElementSetResult> results)
    {
        foreach (ElementSetResult setResult in results)
            foreach (ElementResult item in setResult.Items)
                foreach (LeadRect resultRect in item.ListOfBounds)
                {
                    var automation = docViewer.Annotations.Automation;
                    if (automation != null)
                    {
                        // Containers are zero-based, so subtract 1 from the page number
                        var pageContainer = automation.Containers[item.PageNumber - 1];
                        // Convert once, then add the corners clockwise from the top-left
                        LeadRectD rect = resultRect.ToLeadRectD();
                        AnnHiliteObject annHighlight = new AnnHiliteObject();
                        annHighlight.Points.Add(rect.TopLeft);
                        annHighlight.Points.Add(rect.TopRight);
                        annHighlight.Points.Add(rect.BottomRight);
                        annHighlight.Points.Add(rect.BottomLeft);
                        pageContainer.Children.Add(annHighlight);
                        automation.Invalidate(LeadRectD.Empty);
                        automation.InvokeAfterObjectChanged(pageContainer.Children, AnnObjectChangedType.Added);
                        // Refresh the first thumbnail
                        if (docViewer.Thumbnails != null)
                            docViewer.Thumbnails.ImageViewer.InvalidateItemByIndex(0);
                    }
                }
    }
}
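The message box in Run() reports the item count of the first result set only, and the thumbnail refresh targets index 0. If a rule-set yields several result sets or matches on later pages, a summary across all sets may be more useful. The sketch below shows one way to compute it. MatchItem and MatchSet are hypothetical stand-ins for ElementResult and ElementSetResult so that the example runs without the toolkit; only the Items and PageNumber members used in the tutorial code are mirrored.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for ElementResult (only PageNumber is mirrored).
public sealed class MatchItem
{
    // 1-based page number, as implied by the "PageNumber - 1" indexing above
    public int PageNumber { get; set; }
}

// Hypothetical stand-in for ElementSetResult (only Items is mirrored).
public sealed class MatchSet
{
    public List<MatchItem> Items { get; } = new List<MatchItem>();
}

public static class MatchSummary
{
    // Total number of matched items across every result set.
    public static int TotalMatches(IEnumerable<MatchSet> sets)
    {
        return sets.Sum(s => s.Items.Count);
    }

    // Distinct zero-based page indexes that contain at least one match, in ascending order.
    public static List<int> AffectedPageIndexes(IEnumerable<MatchSet> sets)
    {
        return sets.SelectMany(s => s.Items)
                   .Select(i => i.PageNumber - 1)
                   .Distinct()
                   .OrderBy(i => i)
                   .ToList();
    }
}
```

In the real action, the same LINQ expressions could be applied to the results list directly, with the total used in the message and the page indexes used to decide which thumbnails to refresh.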
Run the project by pressing F5, or by selecting Debug -> Start Debugging.
If the steps were followed correctly, the application runs. After a document is loaded into the Document Viewer, the Analyze button performs analysis according to a selected JSON rule-set. Once analysis is complete, a message box reports the number of matches, and a highlight annotation is drawn for each result.
For samples to use with this tutorial, use the 1040EZ.PDF and the 1040EZ.json rule-set located here: C:\LEADTOOLS23\Resources\Images\Forms\Unstructured
In this tutorial, we covered how to use a JSON rule-set file with the DocumentAnalyzer class to process a loaded document and implement a HighlightAction class to draw an AnnHiliteObject for each matching result in the DocumentViewer.