LEADTOOLS Document Readers

The LEADTOOLS Document Readers allow reading images, thumbnails, text data and metadata from any of the supported types using a uniform set of methods and properties, regardless of the document type.

The current implementation of the LEADTOOLS Document Readers supports reading the following document types:

PDF - document text is parsed without the need of an OCR engine.
XPS - document text is parsed without the need of an OCR engine.
Raster - raster image files such as TIFF and JPEG. An OCR engine is required to parse the text of the document.
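
The uniform-access idea behind the list above can be illustrated with a minimal Python sketch. It is not the LEADTOOLS API: the class names, the `needs_ocr` attribute and the `reader_for` helper are hypothetical, and the extension table is an assumption made only for illustration.

```python
from pathlib import Path


class DocumentReader:
    """Hypothetical common base class; every reader exposes the same members."""

    needs_ocr = False  # overridden by readers that cannot parse text directly

    def __init__(self, path: str) -> None:
        self.path = path

    def text_source(self) -> str:
        """Describe where the text of this document would come from."""
        return "OCR engine" if self.needs_ocr else "parsed from the file"


class PdfReader(DocumentReader):
    pass  # PDF text is parsed without an OCR engine


class XpsReader(DocumentReader):
    pass  # XPS text is parsed without an OCR engine


class RasterReader(DocumentReader):
    needs_ocr = True  # raster images carry no text, so OCR is required


# Assumed extension table, for illustration only.
_READERS = {
    ".pdf": PdfReader,
    ".xps": XpsReader,
    ".tif": RasterReader,
    ".tiff": RasterReader,
    ".jpg": RasterReader,
    ".jpeg": RasterReader,
}


def reader_for(path: str) -> DocumentReader:
    """Pick a reader from the file extension (case-insensitive)."""
    suffix = Path(path).suffix.lower()
    try:
        return _READERS[suffix](path)
    except KeyError:
        raise ValueError(f"no reader registered for {suffix!r}") from None


for name in ("report.pdf", "scan.TIFF"):
    reader = reader_for(name)
    print(name, "->", type(reader).__name__, "/", reader.text_source())
```

Calling code in this sketch only touches `DocumentReader` members, so it does not need to know which concrete reader it received.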

LEADTOOLS will add more document readers and functionality in the near future for documents such as DICOM, DOC/DOCX (2007/2010), XLS/XLSX (2007/2010) and RTF, along with more object types such as images, bookmarks, hyperlinks and annotations. Currently, support for these formats is provided by the Raster document reader, with text parsing supported by an external OCR engine.

Key Features:

Support for PDF/A. PDF/A is a subset of PDF obtained by leaving out PDF features not suited to long-term archiving. The resulting document is guaranteed to look exactly like the original version when viewed on the target machine. Saving with the PDF/A document type may result in larger output files.
Extra options for the various flavors of the supported output documents. For example, adding page numbers and page breaks to text files, and frames (text box objects) to DOC files.
Support for monitoring the document creation operation through a progress event. Your application can give the user visual feedback with a progress bar and a cancel button, allowing the current operation to be both monitored and aborted.
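
The progress-and-cancel pattern described in the last item can be sketched in a few lines of Python. This is a generic illustration under stated assumptions, not the LEADTOOLS event: `process_pages` and its callback signature are hypothetical, and the callback returning `False` stands in for the user pressing a cancel button.

```python
from typing import Callable


def process_pages(page_count: int, on_progress: Callable[[int], bool]) -> int:
    """Process pages one by one, reporting progress after each page.

    on_progress receives the percentage completed and returns True to
    continue or False to abort. Returns the number of pages processed.
    """
    done = 0
    for _ in range(page_count):
        # Real per-page work (hypothetical) would happen here.
        done += 1
        percent = done * 100 // page_count
        if not on_progress(percent):
            break  # the caller asked to abort the operation
    return done


def report(percent: int) -> bool:
    """Example callback: print progress and cancel once half is done."""
    print(f"{percent}% complete")
    return percent < 50


pages_done = process_pages(10, report)
print("pages processed:", pages_done)  # stops after 5 of 10 pages
```

In a GUI application the callback would typically update a progress bar and check whether the cancel button has been pressed, instead of printing.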

Supported Platforms

Document Readers .NET