Intelligent Document Processing (IDP) Component
LEAD’s investment in AI and machine learning is showcased in the Document Analyzer SDK, which automatically detects and extracts data from any type of structured or unstructured form, document, or image with simple rule-based configurations.
All Document Analyzer features are provided without the need for additional third-party tools or applications. Some of those features include:
- Location search, including relative locations
- Conditional search to match and filter the results
- Partial-match and full-match regular expression (regex) support
- Predefined rules for common data types such as SSN, ID number, tax ID, address, email address, and more
- Functions to add custom rulesets that find, collect, and act upon information of interest
- Actions such as redact, highlight, and extract can be applied to data of interest
- Handling of various data formats, including tables, text flows, and data that spans multiple lines
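To make the rule-based features above more concrete, the following is a minimal, generic sketch of regex rules with partial-match and full-match modes plus a redact action. It is written in plain Python and does not use the LEADTOOLS API; the names `Rule`, `RULES`, `find_matches`, and `redact`, as well as the two patterns, are hypothetical and chosen only for illustration.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """A hypothetical rule: a name, a regex, and a match mode."""
    name: str
    pattern: str
    full_match: bool = False  # True: the whole input must match the pattern


# Illustrative patterns only; real-world SSN and email rules are stricter.
RULES = [
    Rule("ssn", r"\b\d{3}-\d{2}-\d{4}\b"),
    Rule("email", r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
]


def find_matches(text, rules):
    """Return (rule name, start, end, matched text) tuples ordered by position."""
    hits = []
    for rule in rules:
        if rule.full_match:
            # Full match: accept only if the entire text satisfies the rule.
            m = re.fullmatch(rule.pattern, text)
            if m:
                hits.append((rule.name, m.start(), m.end(), m.group()))
        else:
            # Partial match: collect every occurrence inside the text.
            for m in re.finditer(rule.pattern, text):
                hits.append((rule.name, m.start(), m.end(), m.group()))
    return sorted(hits, key=lambda h: h[1])


def redact(text, hits, mask="#"):
    """Apply a redact action: replace each matched span with mask characters."""
    out, pos = [], 0
    for _, start, end, _ in hits:
        if start < pos:
            continue  # skip a span that overlaps one already redacted
        out.append(text[pos:start])
        out.append(mask * (end - start))
        pos = end
    out.append(text[pos:])
    return "".join(out)
```

In this sketch, `redact(text, find_matches(text, RULES))` masks every SSN-like and email-like span in `text`, while a rule created with `full_match=True` only fires when the whole input is the value of interest.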
Smart Data Extraction
Harnessing the power of LEAD’s Forms Recognition and Processing libraries, the Document Analyzer uses rules to intelligently extract text, paragraphs, or any key-value pair from text-based office documents (DOC, DOCX, XLS, XLSX), PDFs, and document images (JPG, TIFF, PNG, PDF). This smart data extraction automatically finds key phrases in structured and unstructured documents such as invoices, statements, bills of lading, and receipts, even if the layouts differ completely between files. Additionally, the component performs deep analysis to further improve detection so that all data of interest is found.
Analyze Any Input—Even Mixed Content
The Document Analyzer works on all types of input, including text-based files, image-based files, and files with mixed text and image content, by integrating the LEADTOOLS proprietary OCR technology, which is built with patented machine learning algorithms.
Confidence Ratings Provided
The Document Analyzer provides a confidence rating for each recognized value, so users can accept or decline values individually. A solution developer can also use the rating to accept or reject recognized values automatically while keeping full control of the workflow that follows.
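As a rough illustration of how a developer might act on confidence ratings, the sketch below sorts recognized values into accepted, needs-review, and rejected groups. It is generic Python, not the LEADTOOLS API: the `RecognizedValue` type, the `triage` function, the 0 to 100 scale, and both thresholds are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecognizedValue:
    """Hypothetical container for one extracted value."""
    field: str
    text: str
    confidence: float  # assumed scale: 0 (no confidence) to 100 (certain)


def triage(values, accept_at=90.0, reject_below=50.0):
    """Split values into (accepted, review, rejected) by confidence.

    The thresholds are illustrative defaults, not recommended settings.
    """
    accepted, review, rejected = [], [], []
    for value in values:
        if value.confidence >= accept_at:
            accepted.append(value)   # high confidence: accept automatically
        elif value.confidence < reject_below:
            rejected.append(value)   # low confidence: reject automatically
        else:
            review.append(value)     # in between: leave for a person to decide
    return accepted, review, rejected
```

Keeping a middle band for manual review reflects both uses described above: confident values flow through automatically, and a user accepts or declines the rest one by one.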
Save Space in Your Document Management System
Industries such as health care, finance, and insurance regularly process large volumes of documents that contain sensitive data, and manual redaction and file storage are common pain points. Redacting documents by hand and storing both the redacted and unredacted copies in a document management system takes considerable time and space. By leveraging the powerful machine vision libraries within the LEADTOOLS Document Analyzer, users need to store only the unredacted files, and the system can redact them automatically on the fly when a file is requested.
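The storage pattern described above can be sketched in a few lines: keep a single unredacted copy and produce the redacted view only at request time. This is a simplified, text-only illustration in plain Python, not the LEADTOOLS API; the `DocumentStore` class, its methods, the `can_view_sensitive` flag, and the SSN-like pattern are all hypothetical.

```python
import re

# Illustrative pattern for SSN-like values; a real rule set would be broader.
SENSITIVE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


class DocumentStore:
    """Hypothetical store that keeps only the unredacted original."""

    def __init__(self):
        self._originals = {}

    def save(self, doc_id, text):
        # Only one copy is stored; no redacted duplicate is written.
        self._originals[doc_id] = text

    def fetch(self, doc_id, can_view_sensitive=False):
        original = self._originals[doc_id]
        if can_view_sensitive:
            return original
        # Redact on the fly: mask each sensitive span for this request only.
        return SENSITIVE.sub(lambda m: "#" * len(m.group()), original)
```

Because the redacted view is computed per request, changing the rules later affects every document immediately, with no stored redacted copies to regenerate.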
An Interface for Any User
The Document Analyzer is provided as a configuration-driven application for ease of use and as .NET and Java classes for the ultimate in flexibility.
Easy to Integrate
LEADTOOLS handles the heavy lifting, eliminating months of R&D, while giving you the best quality and performance available. You'll be free to focus on other components of your application. Download the LEADTOOLS evaluation to streamline your development.
Document Analyzer SDK Platforms and Programming Interfaces
Operating Systems
Projects that use LEADTOOLS Document Analyzer libraries can be deployed to Windows, Linux, macOS, Android, iOS, and Web devices.
Frameworks
Developers who use the following frameworks can use the Document Analyzer SDK: .NET 6+, .NET Framework, .NET MAUI, Xamarin, UWP, WinForms, ASP.NET, and Web Services / Web API (JSON, SOAP, REST).
Programming, Scripting, Markup
Document Analyzer code snippets and demo applications are provided for the following: C#, VB, XAML, Java, and HTML / JavaScript
Start Coding With LEADTOOLS Document Analyzer
The LEADTOOLS evaluation includes the Document Analyzer libraries as well as all LEADTOOLS Recognition, Document, Medical, Vector, and Imaging technologies for all development and target platforms, including Windows, Linux, and macOS.