(Deprecated) The document formats supported by the LEADTOOLS OCR toolkit.

Syntax

Visual Basic (Declaration)
<ObsoleteAttribute("Use Leadtools.Forms.DocumentWriters.DocumentFormat instead")> <SerializableAttribute()> Public Enum OcrDocumentFormat Inherits Enum

Visual Basic (Usage)
`Dim instance As OcrDocumentFormat`

C#
[ObsoleteAttribute("Use Leadtools.Forms.DocumentWriters.DocumentFormat instead")] [SerializableAttribute()] public enum OcrDocumentFormat : Enum

C++/CLI
[ObsoleteAttribute("Use Leadtools.Forms.DocumentWriters.DocumentFormat instead")] [SerializableAttribute()] public enum class OcrDocumentFormat : public Enum

Members

Member	Description
AsciiText	ASCII Text. This is the most basic format: the document is a text file with a line break after each line. If a table is present, its cells are positioned by tabs. The text returned by RecognizeText uses this format. Note: Use DocumentFormat.Text instead.
AsciiTextLayoutRetained	ASCII Text output, layout retention with mimicked spaces. Line/cell contents are surrounded by quotes (""). Note: Use DocumentFormat.Text instead.
AsciiTextCommaDelimited	ASCII Comma delimited text output. Line/cell contents are surrounded by quotes (""). Note: Use DocumentFormat.Text instead.
AsciiTextFormatted	ASCII Text output allowing quick text conversion. Line break after each line and after each zone. Note: Use DocumentFormat.Text instead.
UnicodeText	UNICODE Text with line break after each line. If a table is present, its cells are positioned by tabs. Note: Use DocumentFormat.Text instead.
UnicodeTextLayoutRetained	UNICODE Text output, layout retention with mimicked spaces. Line/cell contents are surrounded by quotes (""). Note: Use DocumentFormat.Text instead.
UnicodeTextCommaDelimited	UNICODE Comma delimited text output. Line/cell contents are surrounded by quotes (""). Note: Use DocumentFormat.Text instead.
UnicodeTextFormatted	UNICODE Text output allowing quick text conversion. Line break after each line and after each zone. Note: Use DocumentFormat.Text instead.
Html32	HTML output. HTML 3.2 is useful for exporting with partial formatting. The output files are supported by all major browsers. Note: Use DocumentFormat.Html instead.
Html40	HTML output. HTML 4.0 can set the exact position and size of objects; use this output format for full formatting. Note: Use DocumentFormat.Html instead.
Word97	Microsoft Word 97 (doc) output format. Note: Use DocumentFormat.Doc instead.
Word2000	Microsoft Word 2000 (doc) output format. Note: Use DocumentFormat.Doc instead.
Word2003	Microsoft Word 2003 (doc) output format. Note: Use DocumentFormat.Doc instead.
WordML	Microsoft Office Open XML (docx) output format. Note: The LEADTOOLS Document Writers does not currently support an equivalent to this format.
Excel97	Microsoft Excel 97 (xls) output format. Note: The LEADTOOLS Document Writers does not currently support an equivalent to this format.
Excel2000	Microsoft Excel 2000 (xls) output format. Note: The LEADTOOLS Document Writers does not currently support an equivalent to this format.
Rtf	Rich Text Format for Word 97 and later. Note: Use DocumentFormat.Rtf instead.
RtfWordPad	Rich Text Format for Microsoft WordPad. Note: Use DocumentFormat.Rtf instead.
InfoPath	Microsoft InfoPath XML document output format. Note: The LEADTOOLS Document Writers does not currently support an equivalent to this format.
Pdf	Adobe PDF. Displaying the generated PDF file in a PDF-reader results in a very similar look to the original document. The text can be searched. The PDF file contains the recognized characters in the same positions as in the original. The original page image is overlaid on top of the PDF document. Note: Use DocumentFormat.Pdf instead.
PdfImage	Adobe PDF with raster image only. Note: Use DocumentFormat.Pdf instead.
PdfText	Adobe PDF with text only. The text can be searched. The PDF file contains the recognized characters in the same positions as in the original. The original page image is not overlaid on top of the PDF document. Note: Use DocumentFormat.Pdf instead.
PdfEdited	Adobe PDF with text and image. Use this format if you have used IOcrPage.SetRecognizedCharacters to insert or delete characters in the recognized data. The engine will re-arrange the character boxes before saving the resulting PDF file. Note: Use DocumentFormat.Pdf instead.
PdfWithImageSubstitutes	Adobe PDF with text only. Missing and rejected characters are replaced by small images from the original page resulting in a better looking document than PdfText. The text can be searched. The PDF file contains the recognized characters in the same positions as in the original. Note: Use DocumentFormat.Pdf instead.
PdfA	Adobe PDF/A format. The original page image is overlaid on top of the PDF document. Optimized for the long-term archiving of electronic documents and is based on the PDF Reference Version 1.4 from Adobe Systems Inc. (implemented in Adobe Acrobat 5). Note: Use DocumentFormat.Pdf instead.
PdfAText	Adobe PDF/A format with text only. Optimized for the long-term archiving of electronic documents and is based on the PDF Reference Version 1.4 from Adobe Systems Inc. (implemented in Adobe Acrobat 5). Note: Use DocumentFormat.Pdf instead.
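
Because each member's note names its replacement, moving away from this enumeration can be table-driven. The following is a minimal sketch rather than LEADTOOLS code: `LegacyOcrFormat` is a local stand-in that mirrors a subset of the member names above so the example compiles without the LEADTOOLS assemblies, the replacement is returned as a name string instead of a real `DocumentFormat` value, and `FormatMigration` and `Replacement` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

// Local stand-in for a subset of the deprecated OcrDocumentFormat members.
// This is NOT the LEADTOOLS type; it only mirrors names from the Members table.
public enum LegacyOcrFormat
{
    AsciiText, UnicodeText, Html32, Html40, Word97, Word2003,
    WordML, Excel97, Rtf, RtfWordPad, InfoPath, Pdf, PdfImage, PdfA
}

// Hypothetical helper: maps a deprecated member to the replacement named in its note.
public static class FormatMigration
{
    // Members whose note says no equivalent exists are left out on purpose.
    private static readonly Dictionary<LegacyOcrFormat, string> Map =
        new Dictionary<LegacyOcrFormat, string>
        {
            { LegacyOcrFormat.AsciiText,   "DocumentFormat.Text" },
            { LegacyOcrFormat.UnicodeText, "DocumentFormat.Text" },
            { LegacyOcrFormat.Html32,      "DocumentFormat.Html" },
            { LegacyOcrFormat.Html40,      "DocumentFormat.Html" },
            { LegacyOcrFormat.Word97,      "DocumentFormat.Doc"  },
            { LegacyOcrFormat.Word2003,    "DocumentFormat.Doc"  },
            { LegacyOcrFormat.Rtf,         "DocumentFormat.Rtf"  },
            { LegacyOcrFormat.RtfWordPad,  "DocumentFormat.Rtf"  },
            { LegacyOcrFormat.Pdf,         "DocumentFormat.Pdf"  },
            { LegacyOcrFormat.PdfImage,    "DocumentFormat.Pdf"  },
            { LegacyOcrFormat.PdfA,        "DocumentFormat.Pdf"  }
        };

    // Returns the replacement name, or null when the table lists no equivalent.
    public static string Replacement(LegacyOcrFormat format)
    {
        string name;
        return Map.TryGetValue(format, out name) ? name : null;
    }
}

public static class Program
{
    public static void Main()
    {
        // Print the replacement for every stand-in member.
        foreach (LegacyOcrFormat f in Enum.GetValues(typeof(LegacyOcrFormat)))
        {
            string r = FormatMigration.Replacement(f);
            Console.WriteLine("{0} -> {1}", f, r ?? "(no equivalent listed)");
        }
    }
}
```

Returning null for WordML, Excel97 and InfoPath keeps the "no equivalent" case explicit, so calling code has to decide what to do with those formats instead of silently falling back to another one.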

Example

For an example, refer to IOcrDocument, IOcrDocumentManager and IOcrEngine.

Remarks

(Deprecated) All formats supported by Leadtools.Forms.DocumentWriters can now be used from OCR. For a list of the formats supported by LEADTOOLS OCR, refer to DocumentFormat. To get the engine's native formats (if any), use GetEngineSupportedFormats.

The IOcrDocument interface contains the IOcrDocument.Save methods which allow you to save the recognized pages data to a final document format such as PDF, DOC and HTML (or XML through IOcrDocument.SaveXml).

Not all of the formats are supported by an IOcrEngine. To get the formats supported by a particular engine, use the IOcrDocumentManager.GetSupportedFormats or IOcrDocumentManager.IsFormatSupported methods.

To get the file extension for an OcrDocumentFormat, use IOcrDocumentManager.GetFormatFileExtension.

To get the friendly name of an OcrDocumentFormat, use IOcrDocumentManager.GetFormatFriendlyName.
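
The query methods above combine naturally: check that the engine supports a format, then use its file extension to name the output. The following is a minimal sketch of that pattern that runs without the LEADTOOLS assemblies. `IFormatQueries` is a hypothetical local interface that mirrors the four IOcrDocumentManager method names mentioned above; its signatures are assumptions, and `LegacyOcrFormat`, `FakeManager` and `OutputNaming` are illustrative stand-ins, not LEADTOOLS types.

```csharp
using System;
using System.IO;

// Stand-in for a few OcrDocumentFormat members (not the LEADTOOLS type).
public enum LegacyOcrFormat { AsciiText, Pdf, PdfA }

// Hypothetical mirror of the IOcrDocumentManager methods named in the remarks.
// The signatures are assumptions; check the IOcrDocumentManager reference.
public interface IFormatQueries
{
    LegacyOcrFormat[] GetSupportedFormats();
    bool IsFormatSupported(LegacyOcrFormat format);
    string GetFormatFileExtension(LegacyOcrFormat format);
    string GetFormatFriendlyName(LegacyOcrFormat format);
}

// Fake engine used only so the sketch runs; it supports everything except PdfA.
public sealed class FakeManager : IFormatQueries
{
    public LegacyOcrFormat[] GetSupportedFormats()
    {
        return new LegacyOcrFormat[] { LegacyOcrFormat.AsciiText, LegacyOcrFormat.Pdf };
    }

    public bool IsFormatSupported(LegacyOcrFormat format)
    {
        return Array.IndexOf(GetSupportedFormats(), format) >= 0;
    }

    public string GetFormatFileExtension(LegacyOcrFormat format)
    {
        return format == LegacyOcrFormat.AsciiText ? "txt" : "pdf";
    }

    public string GetFormatFriendlyName(LegacyOcrFormat format)
    {
        return format.ToString();
    }
}

public static class OutputNaming
{
    // Builds an output file name, or returns null if the engine cannot write the format.
    public static string BuildOutputName(IFormatQueries manager, string baseName, LegacyOcrFormat format)
    {
        if (!manager.IsFormatSupported(format))
            return null;
        // Path.ChangeExtension adds the leading period when the extension lacks one.
        return Path.ChangeExtension(baseName, manager.GetFormatFileExtension(format));
    }
}

public static class Program
{
    public static void Main()
    {
        IFormatQueries manager = new FakeManager();
        foreach (LegacyOcrFormat f in Enum.GetValues(typeof(LegacyOcrFormat)))
        {
            string name = OutputNaming.BuildOutputName(manager, "scan", f);
            Console.WriteLine("{0}: {1}", manager.GetFormatFriendlyName(f), name ?? "(not supported)");
        }
    }
}
```

Checking support before building the name means an unsupported format is reported before any save is attempted.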

Some of the document formats require a special key to unlock. Before using these formats, you must unlock the corresponding support type using the RasterSupport class.

The following table lists these document formats and the support type that must be unlocked before they can be used:

Document Format	Support Type
Pdf, PdfImage, PdfText, PdfEdited and PdfWithImageSubstitutes	RasterSupportType.OcrPlusPdfOutput when using the OcrEngineType.Plus engine, RasterSupportType.OcrProfessionalPdfOutput when using the OcrEngineType.Professional engine and RasterSupportType.OcrAdvantagePdfLeadOutput when using the OcrEngineType.Advantage engine
PdfA and PdfAText	RasterSupportType.OcrPlusPdfLeadOutput when using the OcrEngineType.Plus engine, RasterSupportType.OcrProfessionalPdfLeadOutput when using the OcrEngineType.Professional engine and RasterSupportType.OcrAdvantagePdfLeadOutput when using the OcrEngineType.Advantage engine
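
The table above can be read as a lookup keyed by format and engine type. The following is a minimal sketch of that lookup that runs without the LEADTOOLS assemblies: `EngineKind` is a local stand-in for OcrEngineType, the result is the RasterSupportType member name as a string taken from the table, and `PdfUnlockLookup` and `RequiredSupportType` are hypothetical names.

```csharp
using System;

// Local stand-in for OcrEngineType (not the LEADTOOLS type).
public enum EngineKind { Plus, Professional, Advantage }

public static class PdfUnlockLookup
{
    // Returns the RasterSupportType member name listed in the table,
    // or null when the table has no entry for the given format name.
    public static string RequiredSupportType(string formatName, EngineKind engine)
    {
        bool isPdfA = formatName == "PdfA" || formatName == "PdfAText";
        bool isPdf = formatName == "Pdf" || formatName == "PdfImage" ||
                     formatName == "PdfText" || formatName == "PdfEdited" ||
                     formatName == "PdfWithImageSubstitutes";
        if (!isPdf && !isPdfA)
            return null;

        switch (engine)
        {
            case EngineKind.Plus:
                return isPdfA ? "RasterSupportType.OcrPlusPdfLeadOutput"
                              : "RasterSupportType.OcrPlusPdfOutput";
            case EngineKind.Professional:
                return isPdfA ? "RasterSupportType.OcrProfessionalPdfLeadOutput"
                              : "RasterSupportType.OcrProfessionalPdfOutput";
            default:
                // The table lists the same support type for both rows with the Advantage engine.
                return "RasterSupportType.OcrAdvantagePdfLeadOutput";
        }
    }
}

public static class Program
{
    public static void Main()
    {
        Console.WriteLine(PdfUnlockLookup.RequiredSupportType("PdfText", EngineKind.Plus));
        Console.WriteLine(PdfUnlockLookup.RequiredSupportType("PdfA", EngineKind.Professional));
        Console.WriteLine(PdfUnlockLookup.RequiredSupportType("Pdf", EngineKind.Advantage));
    }
}
```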

Inheritance Hierarchy

System.Object
   System.ValueType
      System.Enum
         Leadtools.Forms.Ocr.OcrDocumentFormat

Requirements

Target Platforms: Microsoft .NET Framework 3.0, Windows XP, Windows Server 2003 family, Windows Server 2008 family

Leadtools.Forms.Ocr	Requires Document/Medical product license | Help Version 16.5.9.25