Highlight Words With the Document Analyzer - WPF C#

This tutorial shows how to highlight words in a loaded document using the DocumentAnalyzer class, according to a JSON ruleset, in a WPF C# application.

Overview

Summary: This tutorial covers how to use LEADTOOLS Document Analyzer in a C# Windows WPF Application.
Completion Time: 30 minutes
Visual Studio Project: Download tutorial project (10 KB)
Platform: WPF C# Application
IDE: Visual Studio 2017, 2019
Development License: Download LEADTOOLS

Required Knowledge

Before working on this tutorial, get familiar with the basic steps of creating a project by reviewing two earlier tutorials: "Add References and Set a License" and "Display Files in the Document Viewer".

Create the Project and Add the LEADTOOLS References

Start with a copy of the project created in the Display Files in the Document Viewer tutorial. If you do not have that project, follow the steps in that tutorial to create it.

The references needed depend upon the purpose of the project. References can be added using either of the following two methods, but not both.

If using NuGet references, this tutorial requires the following NuGet packages:

If using local DLL references, the following DLLs are needed. The DLLs are located at <INSTALL_DIR>\LEADTOOLS21\Bin\Dotnet4\x64:

For a complete list of which DLLs are required for specific features, refer to Files to be Included in your Application.

Set the License File

The License unlocks the features needed for the project. It must be set before any toolkit function is called. For details, including tutorials for different platforms, refer to Setting a Runtime License.

There are two types of runtime licenses:

Note: Adding LEADTOOLS NuGet and local references and setting a license are covered in more detail in the Add References and Set a License tutorial.

Initialize the Document Viewer to use Automated Annotations

With the project created, references added, license set, and code from the Display Files in the Document Viewer tutorial added, coding can begin.

In the Solution Explorer, open MainWindow.xaml. Add the Analyze menu item below to the File menu, directly under the existing <MenuItem Name="_fileLoad" ... /> element.

<MenuItem Name="_fileAnalyze" Header="Analyze" Click="_fileAnalyze_Click"/> 

Right-click on the Design Window and select View Code, or press F7, to bring up the code-behind file for the window (MainWindow.xaml.cs). Add the following using directives at the top of the file:

C#
// Using block at the top  
using System.Collections.Generic; 
using Leadtools.Annotations.Engine; 
using Leadtools.Annotations.Automation; 
using Leadtools.Annotations.Wpf; 
using Leadtools.Document.Data; 
using Leadtools.Document.Analytics; 
using Leadtools.Document.Unstructured; 

In the InitDocumentViewer() method, change UseAnnotations value to true. It should look like the following:

C#
private void InitDocumentViewer() 
{ 
    var createOptions = new DocumentViewerCreateOptions 
    { 
        // Set the UI part where the Document Viewer is displayed 
        ViewContainer = _centerGrid, 
        // Set the UI part where the Thumbnails are displayed  
        ThumbnailsContainer = _thumbnailsTabPageGrid, 
        // Enable using annotations 
        UseAnnotations = true 
    }; 
    // Now create the viewer  
    docViewer = DocumentViewerFactory.CreateDocumentViewer(createOptions); 
    docViewer.View.ImageViewer.Background = SystemColors.AppWorkspaceBrush; 
    docViewer.View.ImageViewer.Zoom(ControlSizeMode.FitAlways, 1, docViewer.View.ImageViewer.DefaultZoomOrigin); 
    docViewer.Thumbnails.ImageViewer.Background = SystemColors.ControlDarkDarkBrush; 
    cache = new FileCache 
    { 
        CacheDirectory = Path.GetFullPath(@".\CacheDir"), 
    }; 
} 

Add the following lines of code at the bottom of the InitDocumentViewer() method to initialize the Automation Manager and Automation Manager Helper.

C#
var automationManager = docViewer.Annotations.AutomationManager; 
var automationManagerHelper = new AutomationManagerHelper(automationManager); 

Add the Document Analyzer Code

Inside the _fileLoad_Click event handler, set the InitialDirectory and Filter properties in the OpenFileDialog object initializer, as shown below. For this demo, the dialog opens in the C:\LEADTOOLS21\Resources\Images\Forms\Unstructured directory and lists only PDF documents.

C#
OpenFileDialog ofd = new OpenFileDialog 
{ 
    InitialDirectory = @"C:\LEADTOOLS21\Resources\Images\Forms\Unstructured", 
    Filter = "PDF Files|*.pdf", 
}; 

Use the code below in the _fileAnalyze_Click event handler to let the user load a JSON rule-set. The application passes that rule-set to the DocumentAnalyzer, which recognizes the matching words so that they can be highlighted.

C#
private void _fileAnalyze_Click(object sender, RoutedEventArgs e) 
{ 
    string ruleset = null; 
 
    if (virtualDocument.Pages.Count > 0) 
    { 
        // Load JSON Rule-Set 
        OpenFileDialog openRuleset = new OpenFileDialog 
        { 
            InitialDirectory = @"C:\LEADTOOLS21\Resources\Images\Forms\Unstructured", 
            Filter = "Ruleset JSON File (*.json)|*.json", 
        }; 
        var resultDlg = openRuleset.ShowDialog(); 
        if (resultDlg.HasValue && resultDlg.Value) 
            ruleset = openRuleset.FileName; 
 
        if (ruleset != null) 
        { 
            try 
            { 
                using (var engines = UnstructuredOcrEngines.Defaults(@"C:\LEADTOOLS21\Bin\Common\OcrLEADRuntime")) 
                { 
                    // Create Analyzer 
                    DocumentAnalyzer analyzer = new DocumentAnalyzer() 
                    { 
                        Reader = new UnstructuredDataReader() 
                        { 
                            OcrEngines = engines.Engines 
                        }, 
                        QueryContext = new FileRepositoryContext(ruleset) 
                    }; 
 
                    // Add Action to Highlight Results 
                    ActionElementSet actions = new ActionElementSet(); 
                    actions.ActionElements.Add(new MyHighlightAction(docViewer)); 
 
                    DocumentAnalyzerRunOptions options = new DocumentAnalyzerRunOptions() 
                    { 
                        ElementQuery = new RepositoryQuery(), 
                        Actions = actions, 
                    }; 
 
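                    docViewer.BeginUpdate(); 
                    try 
                    { 
                        // Run Analyzer; the highlight action adds the annotations 
                        analyzer.Run(virtualDocument, options); 
                    } 
                    finally 
                    { 
                        // Always resume viewer updates, even if the analyzer throws 
                        docViewer.EndUpdate(); 
                    } 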
                } 
            } 
            catch (LeadtoolsException ex) 
            { 
                MessageBox.Show(ex.Message); 
            } 
        } 
    } 
    else 
        MessageBox.Show("Load a Document First"); 
} 
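
The handler above hard-codes the OCR runtime folder and assumes the selected rule-set file can be read. As a minimal sketch, a pre-flight check along the following lines could run before the DocumentAnalyzer is created. AnalyzerInputCheck and FindProblems are hypothetical names used for illustration and are not part of LEADTOOLS; the check uses only System.IO calls.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper (not a LEADTOOLS type): reports missing inputs
// before the analyzer is created, so the user sees a clear message.
public static class AnalyzerInputCheck
{
    public static List<string> FindProblems(string rulesetPath, string ocrRuntimeDir)
    {
        var problems = new List<string>();

        // The rule-set must be an existing file with a .json extension
        if (string.IsNullOrWhiteSpace(rulesetPath) || !File.Exists(rulesetPath))
            problems.Add("Rule-set file not found: " + rulesetPath);
        else if (!string.Equals(Path.GetExtension(rulesetPath), ".json",
                                StringComparison.OrdinalIgnoreCase))
            problems.Add("Rule-set file is not a .json file: " + rulesetPath);

        // The OCR runtime folder must exist on this machine
        if (string.IsNullOrWhiteSpace(ocrRuntimeDir) || !Directory.Exists(ocrRuntimeDir))
            problems.Add("OCR runtime folder not found: " + ocrRuntimeDir);

        return problems;
    }
}
```

In _fileAnalyze_Click, the returned list could be joined with Environment.NewLine and shown in a MessageBox, with the handler returning early when the list is not empty.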

Add the Highlight Action Code

Use the code below to add the implementation for MyHighlightAction, which processes the results obtained from the DocumentAnalyzer and creates a highlight annotation object in the DocumentViewer for each recognized item. This code is a separate class inside the MainWindow.xaml.cs file and should be placed outside the MainWindow class block.

C#
public class MyHighlightAction : HighlightAction 
{ 
    private DocumentViewer docViewer; 
    public MyHighlightAction(DocumentViewer documentViewer) 
    { 
        Id = "HIGHLIGHT_DOCUMENT"; 
        docViewer = documentViewer; 
    } 
 
    public override void Run(LEADDocument document, IList<ElementSetResult> results) 
    { 
        MessageBox.Show($"Document Analyzer Done: {(results.Count > 0 ? results[0].Items.Count.ToString() : "No")} matches found."); 
        // Add highlight annotations for each match 
        process(document, results); 
    } 
 
    private void process(LEADDocument document, IList<ElementSetResult> results) 
    { 
        foreach (ElementSetResult setResult in results) 
            foreach (ElementResult item in setResult.Items) 
                foreach (LeadRect resultRect in item.ListOfBounds) 
                { 
                    var automation = docViewer.Annotations.Automation; 
                    if (automation != null) 
                    { 
                        var pageContainer = automation.Containers[item.PageNumber - 1]; 
 
                        AnnHiliteObject annHighlight = new AnnHiliteObject(); 
                        annHighlight.Points.Add(resultRect.ToLeadRectD().TopLeft); 
                        annHighlight.Points.Add(resultRect.ToLeadRectD().TopRight); 
                        annHighlight.Points.Add(resultRect.ToLeadRectD().BottomRight); 
                        annHighlight.Points.Add(resultRect.ToLeadRectD().BottomLeft); 
 
                        pageContainer.Children.Add(annHighlight); 
 
                        automation.Invalidate(LeadRectD.Empty); 
                        automation.InvokeAfterObjectChanged(pageContainer.Children, AnnObjectChangedType.Added); 
 
                        // Refresh the thumbnail of the page that received the highlight 
                        if (docViewer.Thumbnails != null) 
                            docViewer.Thumbnails.ImageViewer.InvalidateItemByIndex(item.PageNumber - 1); 
                    } 
                } 
    } 
} 
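
The geometry used in process can be checked without the toolkit. The sketch below uses hypothetical stand-in types (Rect and Point2D, which are not LEADTOOLS types) to illustrate the two conventions the action relies on: the four corners are added in clockwise order starting at the top-left, and the 1-based page number is converted to a 0-based container index.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-ins for LeadPointD / LeadRectD, for illustration only
public readonly struct Point2D
{
    public Point2D(double x, double y) { X = x; Y = y; }
    public double X { get; }
    public double Y { get; }
}

public readonly struct Rect
{
    public Rect(double x, double y, double width, double height)
    {
        X = x; Y = y; Width = width; Height = height;
    }
    public double X { get; }
    public double Y { get; }
    public double Width { get; }
    public double Height { get; }
}

public static class HighlightGeometry
{
    // Corners in clockwise order: top-left, top-right, bottom-right, bottom-left
    public static IList<Point2D> Corners(Rect r)
    {
        return new List<Point2D>
        {
            new Point2D(r.X, r.Y),
            new Point2D(r.X + r.Width, r.Y),
            new Point2D(r.X + r.Width, r.Y + r.Height),
            new Point2D(r.X, r.Y + r.Height),
        };
    }

    // Page numbers are 1-based; container indexes are 0-based
    public static int PageIndex(int pageNumber)
    {
        if (pageNumber < 1)
            throw new ArgumentOutOfRangeException(nameof(pageNumber));
        return pageNumber - 1;
    }
}
```

For a rectangle at (10, 20) with width 30 and height 40, Corners returns the points (10, 20), (40, 20), (40, 60), and (10, 60), which mirrors the order in which the points are added to the AnnHiliteObject above.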

Run the Project

Run the project by pressing F5, or by selecting Debug -> Start Debugging.

If the steps were followed correctly, the application should run and display that the license was set properly. To test, follow the steps below:

  1. Click on File -> Open to bring up the OpenFileDialog.

  2. Select the file and it will be loaded into the DocumentViewer.

  3. Click on File -> Analyze to bring up the OpenFileDialog again.

  4. Select the JSON rule-set that matches the document loaded in the viewer, then click Open. The application will search for the values that match the rule-set and highlight them.

For samples to use with this tutorial, use the 1040EZ.PDF and the 1040EZ.json rule-set located here: C:\LEADTOOLS21\Resources\Images\Forms\Unstructured

Highlighted Results from the DocumentAnalyzer

Wrap-up

In this tutorial, we covered how to use a JSON rule-set file with the DocumentAnalyzer class to process a loaded document, and how to implement a HighlightAction subclass that draws an AnnHiliteObject in the DocumentViewer for each matching result.

Help Version 21.0.2023.3.1
Products | Support | Contact Us | Intellectual Property Notices
© 1991-2021 LEAD Technologies, Inc. All Rights Reserved.
