Document Analyzer SDK Libraries

The LEADTOOLS Document Analyzer SDK Library intelligently identifies document components and zones in text-based office documents (DOC, DOCX, XLS, XLSX), PDFs, and document images (JPG, TIFF, PNG, PDF) for automated document processing and smart data extraction. Using Intelligent Document Processing (IDP), the LEADTOOLS Document Analyzer automatically finds key phrases within structured and unstructured documents and extracts data through the noise, even if the layouts between files are completely different. Additionally, the component performs deep analysis to further improve detection, ensuring that nothing is missed and all data of interest is found. .NET 6+, .NET Framework, Xamarin, UWP, C#, VB, Java, HTML / JavaScript, and Python developers building applications that process various types of forms and documents will benefit greatly from integrating this easy-to-use API.

Feature-rich Document Analyzer SDK

Intelligent Document Processing (IDP) Component

LEAD’s investment in AI and machine learning is showcased in the Document Analyzer SDK, which automatically detects and extracts data from any type of structured or unstructured form, document, or image with simple rule-based configurations.

All Document Analyzer features are provided without the need for additional third-party tools or applications. Some of those features include:

  • Location search, including relative locations
  • Conditional search to match and filter the results
  • Partial and full match Regex support
  • Predefined rules for common data types such as SSN, ID number, Tax ID, address, email address, and more
  • Functions to add custom rulesets that find, collect, and act upon information of interest
  • Actions such as redact, highlight, and extract can be applied to data of interest
  • Handles various data formats, including tables, text flows, and data spanning multiple lines
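
The rule-based features listed above can be pictured with a small, generic sketch. It uses only Python's standard `re` module and is not the LEADTOOLS API: the `Rule` class, the `apply_rules` function, and both patterns are hypothetical names and values chosen for illustration, and the patterns are far looser than production-grade SSN or email validation.

```python
import re
from dataclasses import dataclass


@dataclass
class Rule:
    """Hypothetical rule: a name, a regex pattern, and an action to apply."""
    name: str
    pattern: str
    action: str  # "extract" or "redact"


# Illustrative patterns only (assumptions, not predefined LEADTOOLS rules).
RULES = [
    Rule("ssn", r"\b\d{3}-\d{2}-\d{4}\b", "redact"),
    Rule("email", r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b", "extract"),
]


def apply_rules(text, rules):
    """Apply each rule in order; return the processed text and extracted values."""
    extracted = {}
    for rule in rules:
        regex = re.compile(rule.pattern)
        if rule.action == "extract":
            # Collect every full match for this rule.
            extracted[rule.name] = regex.findall(text)
        elif rule.action == "redact":
            # Mask each match with X characters of the same length.
            text = regex.sub(lambda m: "X" * len(m.group()), text)
    return text, extracted


sample = "Contact jane@example.com, SSN 123-45-6789."
redacted, found = apply_rules(sample, RULES)
print(redacted)
print(found)
```

Keeping each rule as data (name, pattern, action) rather than as code mirrors the idea of simple rule-based configuration: adding a new data type means adding one more entry to the list.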

Smart Data Extraction

Harnessing the power of LEAD’s Forms Recognition and Processing libraries, the Document Analyzer intelligently extracts text, paragraphs, or any key-value pair from text-based office documents (DOC, DOCX, XLS, XLSX), PDFs, and document images (JPG, TIFF, PNG, PDF) based on rules. This smart data extraction automatically finds key phrases in structured and unstructured documents such as invoices, statements, bills of lading, and receipts, even if the layouts between files are completely different. Additionally, the component performs deep analysis to further improve detection, ensuring that all data of interest is found and nothing slips through.

Analyze Any Input—Even Mixed Content

The Document Analyzer works on all types of input, including text-based files, image-based files, or files with mixed text and image content, using the seamless integration of the LEADTOOLS proprietary OCR technology built with patented machine learning algorithms.

Confidence Ratings Provided

The Document Analyzer provides a confidence rating for each recognized value, so users can accept or decline values individually. A solution developer can also use the rating to accept or reject recognized values automatically, with full control over the workflow that follows.
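
One way to act on such ratings is a simple threshold triage, sketched below. This is a generic illustration rather than LEADTOOLS code: the `RecognizedValue` class, the `triage` function, the 0 to 100 scale, and both threshold values are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class RecognizedValue:
    """Hypothetical container for one recognized field."""
    field: str
    value: str
    confidence: float  # assumed to be on a 0-100 scale


def triage(values, accept_at=90.0, reject_below=50.0):
    """Split values into auto-accepted, needs-review, and auto-rejected lists."""
    accepted, review, rejected = [], [], []
    for v in values:
        if v.confidence >= accept_at:
            accepted.append(v)
        elif v.confidence < reject_below:
            rejected.append(v)
        else:
            # Mid-range ratings are left for a person to accept or decline.
            review.append(v)
    return accepted, review, rejected


results = [
    RecognizedValue("invoice_number", "INV-1042", 97.5),
    RecognizedValue("total", "1,280.00", 72.0),
    RecognizedValue("tax_id", "??-???", 31.0),
]
accepted, review, rejected = triage(results)
print([v.field for v in accepted])
print([v.field for v in review])
print([v.field for v in rejected])
```

The two thresholds are the tuning knobs: raising `accept_at` sends more values to manual review, while lowering `reject_below` discards fewer values outright.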

Save Space in your Document Management System

Industries such as health care, finance, and insurance regularly process documents that contain sensitive data, and a common pain point is manual data redaction and file storage. Having to manually redact documents and store both the redacted and unredacted files within a document management system can take up a lot of time and space. By leveraging the powerful machine vision libraries within the LEADTOOLS Document Analyzer, users need only store the unredacted files, and the system can automatically redact on-the-fly when a file is requested.
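
The store-once, redact-on-request idea can be sketched in a few lines. The example below is a plain-Python illustration and not the LEADTOOLS API: the in-memory `STORE`, the `fetch` function, its `authorized` flag, and the SSN-like pattern are all hypothetical.

```python
import re

# Illustrative SSN-like pattern (an assumption; real rules would be stricter).
SENSITIVE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

# Hypothetical store that keeps only the unredacted originals.
STORE = {"claim-001": "Patient SSN 123-45-6789, policy A-17."}


def fetch(doc_id, authorized=False):
    """Return the original text, or a copy redacted at request time."""
    text = STORE[doc_id]
    if authorized:
        return text
    # The redacted copy is produced on the fly and never written back.
    return SENSITIVE.sub("[REDACTED]", text)


print(fetch("claim-001"))
print(fetch("claim-001", authorized=True))
```

Because the redacted version is derived at request time, only one copy of each document is stored, and a change to the redaction rules takes effect on the next request without reprocessing the archive.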

An Interface for Any User

The Document Analyzer is provided as a configuration-driven application for ease of use and as .NET and Java classes for the ultimate in flexibility.

Document Analyzer Development Made Easy

Easy to Integrate

LEADTOOLS handles the heavy lifting, eliminating months of R&D, while giving you the best quality and performance available. You'll be free to focus on other components of your application. Download the LEADTOOLS evaluation to streamline your development.

Document Analyzer SDK Platforms and Programming Interfaces

Operating Systems

Projects that use LEADTOOLS Document Analyzer libraries can be deployed to Windows, Linux, macOS, Android, iOS, and Web devices.

Frameworks

Developers who use these frameworks can utilize the Document Analyzer SDK: .NET 6+, .NET Framework, Xamarin, UWP, WinForms, ASP.NET, and Web Services / Web API (JSON, SOAP, REST).

Programming, Scripting, Markup

Document Analyzer code snippets and demo applications are provided for the following: C#, VB, XAML, Java, HTML / JavaScript, and Python.

Start Coding With LEADTOOLS Document Analyzer


The download includes the Document Analyzer libraries as well as all LEADTOOLS Recognition, Document, Medical, Vector, and Imaging technologies for all development and target platforms, including Windows, Linux, and macOS.

Download LEADTOOLS Libraries