typedef struct _DOCWRTALTOXMLOPTIONS
{
   DOCWRTOPTIONS Options;
   DOCWRTALTOXMLMEASUREMENTUNIT MeasurementUnit; // Default = DOCWRTALTOXMLMEASUREMENTUNIT_MM10
   L_TCHAR *FileName; // Optional
   L_TCHAR *ProcessingDateTime; // Optional
   L_TCHAR *ProcessingAgency; // Optional
   L_TCHAR *ProcessingStepDescription; // Optional
   L_TCHAR *ProcessingStepSettings; // Optional
   L_TCHAR *SoftwareCreator; // Optional
   L_TCHAR *SoftwareName; // Optional
   L_TCHAR *SoftwareVersion; // Optional
   L_TCHAR *ApplicationDescription; // Optional
   L_INT FirstPhysicalPageNumber; // Default = 1
   L_BOOL Formatted; // Default = L_FALSE (output formatted XML, if L_TRUE, Indentation is used)
   L_TCHAR Indentation[80]; // Default = " "
   L_UINT uFlags; // Default = 0. One or more DOCWRT_ALTOXML_xxx flags (eg: DOCWRT_ALTOXML_Sort)
   L_INT nDesiredVersion; // Default = 4. Only 4 is supported at the moment
} DOCWRTALTOXMLOPTIONS, *pDOCWRTALTOXMLOPTIONS;
The DOCWRTALTOXMLOPTIONS structure contains the options used when creating documents in the Analyzed Layout and Text Object (ALTO XML) format.
Options: A DOCWRTOPTIONS structure containing options for the ALTO XML format.
MeasurementUnit: The measurement unit to use. The default value is DOCWRTALTOXMLMEASUREMENTUNIT_MM10.
FileName: Optional string containing the file name.
ProcessingDateTime: Optional string containing the processing date/time.
ProcessingAgency: Optional string containing the processing agency.
ProcessingStepDescription: Optional string containing the processing step description.
ProcessingStepSettings: Optional string containing the processing step settings.
SoftwareCreator: Optional string containing the software creator.
SoftwareName: Optional string containing the software name.
SoftwareVersion: Optional string containing the software version.
ApplicationDescription: Optional string containing the application description.
FirstPhysicalPageNumber: The first physical page number. Default = 1.
Formatted: TRUE to output formatted XML using the value of Indentation. Default = FALSE.
Indentation: String containing the value to be used for indentation when Formatted is TRUE. Default is " ".
uFlags: Optional flags member that can contain one or more of the values listed below. The flags can be combined using the bitwise OR operator (|). Default = 0. Possible values are:
Value | Meaning |
---|---|
DOCWRT_ALTOXML_Sort | [0x00000001] If set, the text will be sorted from top-left to bottom-right; otherwise, the text will be saved to the output file in the same order as the input data. |
DOCWRT_ALTOXML_PlainText | [0x00000002] If set, the font information will be discarded and the text will be written without any font style. |
DOCWRT_ALTOXML_ShowGlyphInfo | [0x00000004] If set, extra information is displayed for each glyph (position, bounding rectangle). |
DOCWRT_ALTOXML_ShowGlyphVariants | [0x00000008] If set, text from OCR will display variants for some glyphs. This option is used only when the input comes from an OCR operation. This flag implies DOCWRT_ALTOXML_ShowGlyphInfo. |
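The flag values in the table above can be combined and tested with ordinary bitwise operators. The following is a minimal, standalone sketch: the flag constants are redefined locally from the table so that the code compiles without the LEADTOOLS headers, and `AltoNormalizeFlags` and `AltoHasFlag` are hypothetical helper names, not toolkit functions. The normalization step only makes explicit what the table states (ShowGlyphVariants implies ShowGlyphInfo); the toolkit presumably applies this rule on its own.

```c
/* Flag values copied from the table above. In a real project these
   definitions would come from the LEADTOOLS headers instead. */
#define DOCWRT_ALTOXML_Sort              0x00000001u
#define DOCWRT_ALTOXML_PlainText         0x00000002u
#define DOCWRT_ALTOXML_ShowGlyphInfo     0x00000004u
#define DOCWRT_ALTOXML_ShowGlyphVariants 0x00000008u

/* Hypothetical helper: returns the flags with ShowGlyphInfo set
   whenever ShowGlyphVariants is requested. */
unsigned int AltoNormalizeFlags(unsigned int uFlags)
{
   if (uFlags & DOCWRT_ALTOXML_ShowGlyphVariants)
      uFlags |= DOCWRT_ALTOXML_ShowGlyphInfo;
   return uFlags;
}

/* Hypothetical helper: returns 1 if the given flag is set, 0 otherwise. */
int AltoHasFlag(unsigned int uFlags, unsigned int uFlag)
{
   return (uFlags & uFlag) != 0;
}
```

For example, a caller that wants sorted output with glyph variants would assign `DOCWRT_ALTOXML_Sort | DOCWRT_ALTOXML_ShowGlyphVariants` to uFlags.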
nDesiredVersion: Specifies the version of the ALTO XML specification to which the output should conform. The only supported value at the moment is 4.
pDOCWRTALTOXMLOPTIONS is a pointer to a DOCWRTALTOXMLOPTIONS structure. Generally, where a function parameter type is pDOCWRTALTOXMLOPTIONS, you can declare a DOCWRTALTOXMLOPTIONS variable, update the structure's fields, and pass the variable's address in the parameter. Declaring a pDOCWRTALTOXMLOPTIONS variable is necessary only if your program requires a pointer.
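The pattern described above (declare a structure variable, update its fields, pass its address) might look like the following sketch. It assumes an abridged stand-in for the structure so that it compiles without the LEADTOOLS headers; `SetAltoDefaults`, `GetFirstPhysicalPage`, and `ExampleUsage` are hypothetical names used only for illustration and are not part of the toolkit.

```c
#include <stddef.h>

/* Abridged stand-ins so the sketch compiles without the LEADTOOLS headers.
   The real structure has more members (see the declaration at the top). */
typedef int          L_INT;
typedef unsigned int L_UINT;
typedef int          L_BOOL;

typedef struct _DOCWRTALTOXMLOPTIONS
{
   L_INT  FirstPhysicalPageNumber;
   L_BOOL Formatted;
   L_UINT uFlags;
   L_INT  nDesiredVersion;
} DOCWRTALTOXMLOPTIONS, *pDOCWRTALTOXMLOPTIONS;

/* Hypothetical helper: fills in the documented default values. */
void SetAltoDefaults(pDOCWRTALTOXMLOPTIONS pOptions)
{
   pOptions->FirstPhysicalPageNumber = 1;
   pOptions->Formatted = 0;        /* L_FALSE */
   pOptions->uFlags = 0;
   pOptions->nDesiredVersion = 4;
}

/* Hypothetical consumer standing in for any function whose parameter
   type is pDOCWRTALTOXMLOPTIONS. Returns -1 for a NULL pointer. */
L_INT GetFirstPhysicalPage(pDOCWRTALTOXMLOPTIONS pOptions)
{
   return (pOptions != NULL) ? pOptions->FirstPhysicalPageNumber : -1;
}

L_INT ExampleUsage(void)
{
   DOCWRTALTOXMLOPTIONS options;        /* a plain variable, no pointer variable needed */

   SetAltoDefaults(&options);           /* pass the variable's address */
   options.FirstPhysicalPageNumber = 5; /* update a field directly */
   return GetFirstPhysicalPage(&options);
}
```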
ALTO (Analyzed Layout and Text Object) is an open XML Schema for describing OCR text and layout information. It was originally developed by the EU-funded METAe project and is now maintained by the Library of Congress.
The LEADTOOLS Document Writers support creating ALTO documents. The following features are supported:
The uStructSize member of the Options structure should be set to the size of DOCWRTALTOXMLOPTIONS. Use the sizeof() operator to calculate this value.
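As a rough illustration of the size requirement, the sketch below zero-initializes the structure and then stores `sizeof(DOCWRTALTOXMLOPTIONS)` in `Options.uStructSize`. Both structures are abridged stand-ins (the stand-in DOCWRTOPTIONS is assumed to contain only uStructSize), and `InitAltoOptions` and `AltoOptionsSizeIsValid` are hypothetical helpers rather than toolkit functions.

```c
#include <string.h>

/* Abridged stand-ins so the sketch compiles without the LEADTOOLS headers. */
typedef int          L_INT;
typedef unsigned int L_UINT;

typedef struct _DOCWRTOPTIONS
{
   L_UINT uStructSize;   /* assumed here to be the only member */
} DOCWRTOPTIONS;

typedef struct _DOCWRTALTOXMLOPTIONS
{
   DOCWRTOPTIONS Options;
   L_INT nDesiredVersion;
} DOCWRTALTOXMLOPTIONS, *pDOCWRTALTOXMLOPTIONS;

/* Hypothetical helper: clears the structure, then records its size
   in Options.uStructSize using the sizeof() operator. */
void InitAltoOptions(pDOCWRTALTOXMLOPTIONS pOptions)
{
   memset(pOptions, 0, sizeof(*pOptions));
   pOptions->Options.uStructSize = (L_UINT)sizeof(DOCWRTALTOXMLOPTIONS);
   pOptions->nDesiredVersion = 4;   /* documented default */
}

/* Hypothetical check: returns 1 if the recorded size matches the
   size of the whole ALTO XML options structure, 0 otherwise. */
int AltoOptionsSizeIsValid(const DOCWRTALTOXMLOPTIONS *pOptions)
{
   return pOptions->Options.uStructSize == (L_UINT)sizeof(DOCWRTALTOXMLOPTIONS);
}
```

Note that the size stored is that of the outer DOCWRTALTOXMLOPTIONS structure, not of the embedded DOCWRTOPTIONS member.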
The structure is used by: