LEADTOOLS Leadtools.Topics.Documents.Converters

The Document Converter allows conversion from any type of document to another with a minimal amount of code.

The input and output document types can be any of the file formats supported by LEADTOOLS, including but not limited to:

The DocumentConverter class analyzes the input and output document types and then automatically uses a combination of the LEADTOOLS Raster, SVG and OCR engines to convert the data with the best possible balance of accuracy and speed. Each conversion operation is called a Document Converter Job in the framework.

Input Document

DocumentConverter uses the LEADTOOLS Documents Library to obtain information on the input file. The Document class encapsulates the file format details and exposes a uniform set of functionality for reading the pages and parsing the data needed for the conversion job. This includes loading page data as RasterImage or SvgDocument objects, reading the table of contents and internal page links, and reading any annotation objects embedded in the file or stored in an associated file.

Output Document

The output document file format is divided into two categories:

Conversion Options

The document conversion is designed to run unattended. However, the DocumentConverter provides many options to monitor and modify the operation and to customize the output document as needed. This includes:

Starting Up: DocumentConverter class

The DocumentConverter class is the main entry point to the framework. Create an instance of this class to convert one or more documents, and then set these options:

Member                         Description
SetOcrEngineInstance           IOcrEngine to use for parsing text and objects when SVG is not available in the input document.
SetDocumentWriterInstance      DocumentWriter to use for creating the output file when document format output is selected.
SetAnnRenderingEngineInstance  Optional rendering engine to use when the annotations are overlaid on top of images.
LoadDocumentOptions            Options to use when loading the input document.
Preprocessor                   The pre-processing options to use for cleaning up the images of the input document.
Options                        Additional options to use during the conversion, such as error recovery mode and page number template.
Diagnostics                    Options for logging, such as enabling standard .NET tracing.
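
The setup above can be sketched in C# as follows. This is a minimal sketch, not a definitive implementation: it assumes the version 19 namespaces (Leadtools.Documents.Converters, Leadtools.Forms.Ocr, Leadtools.Forms.DocumentWriters) and an IOcrEngine that the caller has already started. The helper name CreateConverter is hypothetical, and the specific members used under Preprocessor, Options and Diagnostics (Deskew, JobErrorMode, EnableTrace) should be checked against the reference for your SDK version.

```csharp
using Leadtools.Documents.Converters;
using Leadtools.Forms.DocumentWriters;
using Leadtools.Forms.Ocr;

public static class ConverterSetup
{
   // Hypothetical helper: returns a converter that is ready to create jobs.
   // The caller owns ocrEngine and is assumed to have started it already.
   public static DocumentConverter CreateConverter(IOcrEngine ocrEngine)
   {
      var converter = new DocumentConverter();

      // Assumed signature: the second argument tells the converter whether
      // it should dispose the engine. false keeps ownership with the caller.
      converter.SetOcrEngineInstance(ocrEngine, false);
      converter.SetDocumentWriterInstance(new DocumentWriter());

      // Assumed member names; verify against your SDK version.
      converter.Preprocessor.Deskew = true;                                     // straighten skewed page images
      converter.Options.JobErrorMode = DocumentConverterJobErrorMode.Continue;  // keep going after a page error
      converter.Diagnostics.EnableTrace = true;                                 // log through .NET tracing

      return converter;
   }
}
```

Keeping ownership of the OCR engine with the caller lets the same engine be shared by several converters or reused after the converter is disposed.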

Creating Jobs

Once the DocumentConverter class is initialized, use the DocumentConverterJobs class (accessed through the DocumentConverter.Jobs property) to create new conversion jobs.

The parameters for a job are set in a DocumentConverterJobData structure. This contains the following members:

Member                        Description
Document                      Document object to be used as the input of the conversion. Either this or InputDocumentFileName is used.
InputDocumentFileName         Path to the input file for the conversion. Either this or Document is used.
InputAnnotationsFileName      Path to the annotations file to be added to the output document. Optional.
InputDocumentFirstPageNumber  The number of the first page to be converted from the input document. Optional.
InputDocumentLastPageNumber   The number of the last page to be converted from the input document. Optional.
DocumentFormat                The output format when document conversion is used.
RasterImageFormat             The output format when raster conversion is used.
RasterImageBitsPerPixel       The bits per pixel of the output file when raster conversion is used.
OutputDocumentFileName        Name of the output file to be generated by this conversion.
OutputAnnotationsFileName     Name of the file that will contain the annotations parsed from the input document. Optional.
AnnotationsMode               Customizes how the annotations are saved in the output document.
JobName                       Optional name of this job. Useful when tracing is enabled.
UserData                      Optional user-defined object that can be used alongside the job events to pass application-specific data.

The DocumentConverterJobs.CreateJobData overloaded methods can also be used to quickly create jobs from common input and output options.

When all the options are set, the DocumentConverterJobs.CreateJob method is used to create an instance of the DocumentConverterJob class that holds the job options as well as its status. This object is then passed to DocumentConverterJobs.RunJob or DocumentConverterJobs.RunJobAsync to run the operation.
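
As an illustration of the steps above, the following C# sketch fills in a DocumentConverterJobData by hand, creates the job, and runs it synchronously. It assumes the version 19 namespaces and a converter that has already been configured with an OCR engine and a document writer. The class and method names (JobCreation, ConvertToPdf) and the job name are hypothetical.

```csharp
using Leadtools.Documents.Converters;
using Leadtools.Forms.DocumentWriters;

public static class JobCreation
{
   // Hypothetical helper: converts one file to PDF and returns the finished job
   // so that the caller can inspect its status and errors.
   public static DocumentConverterJob ConvertToPdf(
      DocumentConverter converter, string inputFile, string outputFile)
   {
      var jobData = new DocumentConverterJobData
      {
         InputDocumentFileName = inputFile,    // used instead of the Document member
         OutputDocumentFileName = outputFile,
         DocumentFormat = DocumentFormat.Pdf,  // document (not raster) output
         JobName = "pdf-conversion"            // shows up when tracing is enabled
      };

      // CreateJob wraps the options in a job object that also carries the status.
      DocumentConverterJob job = converter.Jobs.CreateJob(jobData);

      // RunJob blocks until the conversion has finished.
      converter.Jobs.RunJob(job);
      return job;
   }
}
```

For common cases, the same job data can usually be produced in one call with a CreateJobData overload, for example DocumentConverterJobs.CreateJobData(inputFile, outputFile, DocumentFormat.Pdf), assuming that overload is available in your SDK version.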

Running Jobs

DocumentConverterJobs.RunJob or DocumentConverterJobs.RunJobAsync is used to run the job created in the previous section. While the job is running, the DocumentConverterJobs.JobStarted (once), DocumentConverterJobs.JobOperation (multiple times) and DocumentConverterJobs.JobCompleted (once) events fire to indicate the job progress.

The event data is of type DocumentConverterJobEventArgs and contains all the necessary information on the current job and its status:

Member                    Description
Job                       The actual job object that was passed to RunJob or RunJobAsync.
Status                    The current status of the job and whether it is still running or has been aborted. The user can abort any running jobs by modifying this property.
Operation                 Current operation being performed by the converter.
IsPostOperation           Whether this event is being fired before or after Operation.
InputDocumentPageNumber   Current page number in the input document.
OutputDocumentPageNumber  Current page number in the output document.
Document                  The Document object being used by this conversion.
DocumentWriter            The DocumentWriter object being used by this operation if document conversion is used.
OcrDocument               The OCR document object being used if this operation is using OCR conversion.
OcrPage                   The OCR page object being used if this operation is using OCR conversion.
SvgDocument               The SVG document being used if this operation is using SVG conversion.
OcrPageImage              The raster image object for the current page if this operation is using OCR conversion.
RasterImage               The raster image being used if this operation is using raster conversion.
AnnContainer              Annotation container being used if annotation conversion is used.
AnnotationsMode           Current annotations conversion mode.

For more information on these members and how they can be used or modified, refer to DocumentConverterJobOperation.

The InputDocumentPageNumber property can be used to show a progress bar indicator of the current conversion operation.
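
A rough C# sketch of such progress reporting is shown below. It assumes the version 19 namespace, that the job events are exposed as standard .NET events on DocumentConverter.Jobs, that the Document object exposes its page count as Pages.Count, and that setting Status to DocumentConverterJobStatus.Aborted is how a handler aborts the job. The class name JobProgress, the Percent helper and the cancelRequested callback are hypothetical.

```csharp
using System;
using Leadtools.Documents.Converters;

public static class JobProgress
{
   // Pure helper: percentage of pages processed, clamped to the range 0..100.
   public static int Percent(int currentPage, int totalPages)
   {
      if (totalPages <= 0)
         return 0;
      return Math.Max(0, Math.Min(100, currentPage * 100 / totalPages));
   }

   // Hypothetical helper: prints progress to the console and aborts the job
   // when cancelRequested returns true.
   public static void Attach(DocumentConverter converter, Func<bool> cancelRequested)
   {
      converter.Jobs.JobStarted += (sender, e) =>
         Console.WriteLine("Started: {0}", e.Job.JobData.JobName);

      converter.Jobs.JobOperation += (sender, e) =>
      {
         // Report only after an operation has finished, and only when the
         // input document is available to supply the total page count.
         if (e.IsPostOperation && e.Document != null)
         {
            int percent = Percent(e.InputDocumentPageNumber, e.Document.Pages.Count);
            Console.WriteLine("{0}: {1}%", e.Operation, percent);
         }

         // Assumed abort mechanism: change the status from inside the handler.
         if (cancelRequested())
            e.Status = DocumentConverterJobStatus.Aborted;
      };

      converter.Jobs.JobCompleted += (sender, e) =>
         Console.WriteLine("Completed: {0}", e.Job.Status);
   }
}
```

The percentage is computed against the whole document. If the job restricts the page range through InputDocumentFirstPageNumber and InputDocumentLastPageNumber, the total should be derived from that range instead.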

Completing Jobs

The job is completed when the RunJob method returns. If RunJobAsync was used, the JobCompleted event should be used to determine when the job is completed. In both cases, the DocumentConverterJob object that was passed contains information on the status of the operation as follows:

Member             Description
Status             The job status. This can be success, success with errors, or aborted.
Errors             A list of any errors that might have occurred during the conversion.
JobData            The original options used to create this job.
DocumentConverter  The document converter object used to run the job.
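
The members above can be inspected once a job has finished, as in the following C# sketch. It assumes the version 19 namespace, that the status enumeration is named DocumentConverterJobStatus with a Success member, and that each entry in Errors exposes Operation, InputDocumentPageNumber and Error (the underlying exception). The class and method names are hypothetical.

```csharp
using System;
using Leadtools.Documents.Converters;

public static class JobReport
{
   // Hypothetical helper: prints a summary of a finished job and
   // returns true only when the job completed without any errors.
   public static bool Report(DocumentConverterJob job)
   {
      if (job.Status == DocumentConverterJobStatus.Success)
      {
         Console.WriteLine("{0}: success", job.JobData.JobName);
         return true;
      }

      // Either finished with errors or aborted: list what went wrong.
      Console.WriteLine("{0}: {1}, {2} error(s)",
         job.JobData.JobName, job.Status, job.Errors.Count);

      foreach (var error in job.Errors)
      {
         Console.WriteLine("  {0} at input page {1}: {2}",
            error.Operation, error.InputDocumentPageNumber, error.Error.Message);
      }

      return false;
   }
}
```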

Multi-Threading

DocumentConverter is thread-safe. The RunJobAsync method can be used to run multiple jobs at the same time, each in a separate thread. Internally, the converter uses the .NET Thread Pool exclusively for creating and managing threads.

RunJobAsync performs a sanity check on the options, then starts the job and returns control to the user immediately. The JobStarted, JobOperation and JobCompleted events can be used to monitor the job status and to be notified when a job is completed. AbortAllJobs can be used at any time to abort all running jobs and cancel any pending jobs.
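
The following C# sketch shows one way to run several jobs with RunJobAsync and wait for all of them. It assumes the version 19 namespaces, that JobCompleted is declared as EventHandler<DocumentConverterJobEventArgs>, that JobCompleted fires exactly once for every started job (including aborted ones), and that a CreateJobData overload taking input path, output path and DocumentFormat is available. The class and method names are hypothetical, and the CountdownEvent is an application-side choice, not part of the framework.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Threading;
using Leadtools.Documents.Converters;
using Leadtools.Forms.DocumentWriters;

public static class BatchConversion
{
   // Hypothetical helper: converts every input file to PDF in parallel and
   // blocks until the converter has reported completion for all of them.
   public static void ConvertAll(
      DocumentConverter converter, IList<string> inputFiles, string outputDirectory)
   {
      if (inputFiles.Count == 0)
         return;

      using (var pending = new CountdownEvent(inputFiles.Count))
      {
         // Runs on a thread pool thread, once per finished job (assumed).
         EventHandler<DocumentConverterJobEventArgs> onCompleted = (sender, e) =>
         {
            Console.WriteLine("{0}: {1}", e.Job.JobData.InputDocumentFileName, e.Job.Status);
            pending.Signal();
         };

         converter.Jobs.JobCompleted += onCompleted;
         try
         {
            foreach (string inputFile in inputFiles)
            {
               string outputFile = Path.Combine(
                  outputDirectory, Path.GetFileNameWithoutExtension(inputFile) + ".pdf");

               var jobData = DocumentConverterJobs.CreateJobData(
                  inputFile, outputFile, DocumentFormat.Pdf);
               var job = converter.Jobs.CreateJob(jobData);

               // Returns immediately; the conversion continues in the background.
               converter.Jobs.RunJobAsync(job);
            }

            // Wait until every job has signaled completion.
            pending.Wait();
         }
         finally
         {
            converter.Jobs.JobCompleted -= onCompleted;
         }
      }
   }
}
```

An application with a user interface would normally not block in Wait. It would instead update its state from the JobCompleted handler and call AbortAllJobs if the user cancels the batch.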

Documents Library Features
Using LEADTOOLS Document Viewer
