Scan Invoice Forms to OCR and Extract Items from an Invoice Document

Posted on 2022-05-23T14:57:51.000Z Ryan Fritz


In my last post, we discussed how the LEADTOOLS Document SDK and its Forms Recognition technology can vastly improve productivity in a paperless office that receives hundreds of different forms and invoices. This post continues from there: it shows not only how to automatically recognize which type of filled form is being processed, but also how to extract the data from that filled form.

To do this, we need to take one more step and look at fields. Fields are regions defined when a master form is created, and the processing engine looks for data in those regions. There are many types of fields, such as Text, Image, Table, OMR, and Barcode. The processing engine loads and processes all the fields, and it is then up to you to decide how that information is distributed. It is good practice to check which type of field was processed and write code for that type accordingly.

The code snippets below show how to determine which master form a filled form matches, load and process the fields, check the type of each field, and display the field information in the console.

Finding the Right Match

For this example, we are going to make a couple of changes to the RecognizeForms function from the last post.

private static void RecognizeForms()
{
    Console.WriteLine("Recognizing Forms\n");

    string[] formsToRecognize = Directory.GetFiles(filledFormsDirectory, "*.tif", SearchOption.AllDirectories);

    string[] masterFileNames = Directory.GetFiles(masterFormsDirectory, "*.bin", SearchOption.AllDirectories);

    foreach (string filledFormName in formsToRecognize)
    {
        RasterImage currentForm = codecs.Load(filledFormName, 0, CodecsLoadByteOrder.BgrOrGray, 1, -1);
        FormRecognitionAttributes filledFormAttributes = LoadFilledFormAttributes(currentForm);

        string resultMessage = "";

        foreach (string masterFileName in masterFileNames)
        {
            FormRecognitionAttributes masterFormAttributes = LoadMasterFormAttributes(masterFileName);

            //Compares the master form to the filled form
            FormRecognitionResult recognitionResult = recognitionEngine.CompareForm(masterFormAttributes, filledFormAttributes, null);

            //When the Recognition Engine compares the two documents it also sets a confidence level for how closely the engine thinks the two documents match
            if (recognitionResult.Confidence >= AllowedConfidenceLevel)
            {
                resultMessage = $"Form {Path.GetFileNameWithoutExtension(filledFormName)} has been recognized as a(n) {Path.GetFileNameWithoutExtension(masterFileName)} with a confidence level of {recognitionResult.Confidence}";

                //Once we found the right master form we can read the filled form
                FormPages filledFormData = ProcessForm(recognitionResult, masterFileName, currentForm);
                PrintFormData(filledFormData);

                break;
            }

            resultMessage = $"The form {Path.GetFileNameWithoutExtension(filledFormName)} failed to be recognized with a confidence level of {recognitionResult.Confidence}";
        }

        Console.WriteLine(resultMessage);
        Console.WriteLine("=============================================================\n");
    }
}

Notice that in this example we now process the filled form data and print that information to the console.
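
The loop above stops at the first master form that clears the confidence threshold. If several master forms look alike, you may prefer to compare against all of them and keep the strongest match. The following is only a sketch of that idea: the tuple of master name and confidence is a hypothetical stand-in for the values returned by CompareForm, and PickBest is not part of the SDK.

```csharp
using System;
using System.Collections.Generic;

public static class BestMatchSketch
{
    // Each candidate is a hypothetical stand-in for one comparison outcome:
    // the master form name and the confidence reported for it.
    // Returns the highest-confidence candidate at or above the threshold,
    // or null when no candidate qualifies.
    public static (string MasterName, int Confidence)? PickBest(
        IEnumerable<(string MasterName, int Confidence)> candidates,
        int allowedConfidence)
    {
        (string MasterName, int Confidence)? best = null;

        foreach (var candidate in candidates)
        {
            // Skip anything below the allowed confidence level
            if (candidate.Confidence < allowedConfidence)
                continue;

            // Keep the candidate only if it beats the current best
            if (!best.HasValue || candidate.Confidence > best.Value.Confidence)
                best = candidate;
        }

        return best;
    }
}
```

With the candidates ("Invoice", 82), ("W9", 95), and ("Receipt", 40) and a threshold of 80, PickBest returns the W9 entry. If nothing reaches the threshold, it returns null, which corresponds to the "failed to be recognized" message above.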

Processing Filled Form Data

Extracting text from the filled forms requires us to pass information gathered by the Recognition Engine to the Processing Engine. Additionally, we have to load the master form's XML file into the Processing Engine to tell it which fields to collect data from. Once the Processing Engine is prepared, we read the information off the filled form.

private static FormPages ProcessForm(FormRecognitionResult recognitionResult, string masterFormFileName, RasterImage filledForm)
{
    // The Recognition Engine records how the master form and the filled form align page by page
    List<PageAlignment> alignment = new List<PageAlignment>();
    for (int k = 0; k < recognitionResult.PageResults.Count; k++)
    {
        alignment.Add(recognitionResult.PageResults[k].Alignment);
    }

    // Load the Processing Engine with the found master form.
    // The fields file is expected next to the master form's .bin file, with an .xml extension,
    // so this also works for master forms found in subdirectories.
    string fieldsFullPath = Path.ChangeExtension(masterFormFileName, ".xml");

    processingEngine.LoadFields(fieldsFullPath);

    // Processing Engine reads filled form
    processingEngine.Process(filledForm, alignment);
    return processingEngine.Pages;
}

Print Filled Form Data

For this example, we are only reading the information off of the filled form and printing it to the console.

private static void PrintFormData(FormPages formData)
{    
    foreach (FormPage formPage in formData)
        foreach (FormField field in formPage)
        {
            List<string> row = new List<string>() { };

            if (field is TextFormField)
                row.AddRange(ReadTextFormField(field));

            else if (field is TableFormField)
                row.AddRange(ReadTableFormField(field));

            else if (field is UnStructuredTextFormField)
                row.AddRange(ReadUnStructuredTextFormField(field));
            
            row.Insert(0, "Field Name: " + field.Name);
            row.Add("Field Bounds: " + field.Bounds.ToString() + "\n------------------------------------------------------------");
            
            foreach (string line in row) Console.WriteLine(line);
        }
}

As you can see, we decide how to present the filled form information based on the type of field being read.
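
The same type check can also be written with a C# pattern-matching switch, which hands you a typed variable in each branch and gives you a default branch for field types you have not written a reader for. The sketch below uses hypothetical stub classes (FieldStub, TextFieldStub, TableFieldStub) in place of the SDK's field classes so that it compiles on its own. They are not the LEADTOOLS types.

```csharp
using System;

// Hypothetical stand-ins for the SDK's field classes, declared here only so
// the sketch is self-contained.
public abstract class FieldStub { public string Name = ""; }
public class TextFieldStub : FieldStub { public string Text = ""; }
public class TableFieldStub : FieldStub { public int RowCount; }

public static class FieldDispatchSketch
{
    // Picks a description based on the runtime type of the field
    public static string Describe(FieldStub field)
    {
        switch (field)
        {
            case TextFieldStub text:
                return $"{text.Name}: text \"{text.Text}\"";

            case TableFieldStub table:
                return $"{table.Name}: table with {table.RowCount} row(s)";

            default:
                // Reached for any field type without a dedicated branch
                return $"{field.Name}: unsupported field type";
        }
    }
}
```

The default branch is the main design point: it makes it obvious when a form contains a field type that your code does not handle yet, instead of silently printing only the name and bounds.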

Reading Text Form Fields

Here we read the text from the text form field result, as well as the confidence the OCR engine has that it read the correct characters.

private static List<string> ReadTextFormField(FormField field)
{
    List<string> row = new List<string>();

    // The field's result carries the recognized text and the OCR confidence
    TextFormFieldResult result = (field as TextFormField).Result as TextFormFieldResult;

    row.Add("Field Type: Text");
    row.Add("Field Value: " + result.Text);

    if (result.AverageConfidence < AllowedConfidenceLevel)
    {
        row.Add("Field Confidence: " + result.AverageConfidence.ToString() + "% ---> Needs manual review");
    }
    else
        row.Add("Field Confidence: " + result.AverageConfidence.ToString() + "%");

    return row;
}

Reading Table Form Fields

Reading table data is similar to reading the data from a text form field. However, with a table we have to read the results from every row and column in the table. Here is an example of how to read table values.

private static List<string> ReadTableFormField(FormField field)
{
    List<string> row = new List<string>();

    List<TableColumn> col = (field as TableFormField).Columns;
    TableFormFieldResult results = (field as TableFormField).Result as TableFormFieldResult;
    row.Add("Field Type: Table");

    for (int i = 0; i < results.Rows.Count; i++)
    {
        TableFormRow tableRow = results.Rows[i];

        row.Add($"------------------Table Row Number: {i + 1}-----------------------\n");

        // Tallest cell in this row, measured in lines of text
        int lineCounter = 1;
        string[] rowInfo = new string[tableRow.Fields.Count];
        for (int j = 0; j < tableRow.Fields.Count; j++)
        {
            OcrFormField ocrField = tableRow.Fields[j];
            TextFormFieldResult txtResults = ocrField.Result as TextFormFieldResult;

            // Flag low-confidence cells for review, but still keep whatever text was read
            if (txtResults.AverageConfidence < AllowedConfidenceLevel)
            {
                row.Add($"{col[j].OcrField.Name} Confidence: {txtResults.AverageConfidence}% ---> Needs manual review\n");
                manualReviewCount++;
            }

            rowInfo[j] = txtResults.Text;
            int counter = 1;

            if (txtResults.Text != null)
                counter += CountCharacterInString(txtResults.Text, '\n');

            if (counter > lineCounter)
                lineCounter = counter;
        }

        // Print each cell next to the name of its column
        for (int k = 0; k < rowInfo.Length; k++)
        {
            row.Add(col[k].OcrField.Name + ": " + rowInfo[k]);
        }
    }
    row.Add("------------------------------------------------------------");

    // The table as a whole reports a status rather than a confidence percentage
    if (results.Status == FormFieldStatus.Failed)
    {
        row.Add("Field Status: Failed ---> Needs manual review");
        manualReviewCount++;
    }
    else
        row.Add("Field Status: Successful");

    return row;
}

private static int CountCharacterInString(String str, char c)
{
    int counter = 0;

    for (int i = 0; i < str.Length; i++) if (str[i] == c) counter++;

    return counter;
}
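
The table reader uses CountCharacterInString to work out how many lines of text a cell contains: one, plus the number of newline characters. As a sketch, the same calculation can be written with LINQ. CountLines is a hypothetical helper name, and treating null text as a single line is an assumption that mirrors the null check in the table reader.

```csharp
using System.Linq;

public static class LineCountSketch
{
    // Number of lines in a cell's text: one more than the number of '\n'
    // characters. Null text is treated as a single (empty) line.
    public static int CountLines(string text)
    {
        if (text == null)
            return 1;

        return 1 + text.Count(ch => ch == '\n');
    }
}
```

For example, CountLines("Widget\nBlue\nLarge") returns 3, while an empty or null cell counts as 1.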

Reading Unstructured Text Form Fields

Unstructured text form fields represent a rectangular region on a form that doesn't have a predefined structure the way a text field or a table does. For this example, we are going to read the unstructured data the same way we read the text field data.

private static List<string> ReadUnStructuredTextFormField(FormField field)
{
    List<string> row = new List<string>();

    // The field's result carries the recognized text and the OCR confidence
    TextFormFieldResult result = (field as UnStructuredTextFormField).Result as TextFormFieldResult;

    row.Add("Field Type: UnStructuredText");
    row.Add("Field Value: " + result.Text);

    if (result.AverageConfidence < AllowedConfidenceLevel)
    {
        row.Add("Field Confidence: " + result.AverageConfidence.ToString() + "% ---> Needs manual review");
        manualReviewCount++;
    }
    else
        row.Add("Field Confidence: " + result.AverageConfidence.ToString() + "%");

    return row;
}
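
Because the text and unstructured text readers build the confidence line in exactly the same way, that formatting could be pulled out into one small helper. This is a sketch only: FormatConfidence is a hypothetical name, and it assumes the confidence values are integer percentages.

```csharp
public static class ConfidenceSketch
{
    // Builds the "Field Confidence" line and appends a review marker when
    // the confidence falls below the allowed level.
    public static string FormatConfidence(int averageConfidence, int allowedConfidence)
    {
        string line = "Field Confidence: " + averageConfidence + "%";

        if (averageConfidence < allowedConfidence)
            line += " ---> Needs manual review";

        return line;
    }
}
```

With an allowed level of 80 (an illustrative value), FormatConfidence(62, 80) produces "Field Confidence: 62% ---> Needs manual review" and FormatConfidence(95, 80) produces "Field Confidence: 95%".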

These are just a few examples of the many different fields that the Processing Engine can handle. Our documentation has a full list of all of the supported form field input types.

Free Evaluation, Free Technical Support, Free Demos, & More!

Download our FREE 60-day evaluation and test all features and actually program before a purchase is even made. Gain access to our extensive documentation, sample source code, demos, and tutorials.

Need Assistance? LEAD Is Here For You

Need help while you wait for more blogs and tutorials? Contact our support team for free technical support! For pricing or licensing questions, you can contact our sales team via email or call us at 704-332-5532.
