RECOGCHARS

typedef struct _tagRecogChars
{
   L_UINT uStructSize;
   RECT rcArea;
   L_INT nYOffset;
   L_INT nSpace;
   L_INT nSpaceErr;
   L_WCHAR wGuessCode;
   L_WCHAR wGuessCode2;
   L_WCHAR wGuessCode3;
   L_INT nZoneIndex;
   L_INT nCellIndex;
   L_INT nConfidence;
   L_UINT uFont;
   L_INT nFontSize;
   L_INT nCharFormat;
   LANGIDS Lang;
   LANGIDS Lang2;
} RECOGCHARS, * pRECOGCHARS;

The RECOGCHARS structure contains information about the recognized characters.

Member

Description

uStructSize

Specifies the structure size. It should be equal to sizeof(RECOGCHARS).

rcArea

RECT structure that contains the area for the recognized character.

nYOffset

Y coordinate of the baseline measured from the top edge of the rectangle exactly containing the character.

nSpace

Number of spaces preceding the recognized character.

nSpaceErr

Confidence number expressing the certainty of the nSpace value.

wGuessCode

Unicode character code. This is either the first guess of the recognition or zero (0), which signals that the engine could not recognize the character (a rejected character).

wGuessCode2

Second guess of the recognition, if any.

wGuessCode3

Third guess of the recognition, if any.

nZoneIndex

Index of the zone in the zone list which contains the character.

nCellIndex

Index of the cell in the cell list which contains the character (applicable only for ZONE_TABLE zones). The cell list is not accessible to the application.

nConfidence

Confidence number expressing both the certainty of the recognition of the first guess (the wGuessCode member) and the certainty of the word.

uFont

Font information about the recognized character. Possible values are given below. These values can be combined using a bitwise OR (|).

 

Value

Meaning

 

FONT_ITALIC

[0x001] The character is italic.

 

FONT_BOLD

[0x002] The character is bold.

 

FONT_UNDERLINE

[0x004] The character is underlined.

 

FONT_SUBSCRIPT

[0x008] The character is subscript.

 

FONT_SUPERSCRIPT

[0x010] The character is superscript.

 

FONT_SANSSERIF

[0x020] The character is Sans Serif.

 

FONT_SERIF

[0x040] The character is Serif.

 

FONT_PROPORTIONAL

[0x080] The character is proportional.

nFontSize

Font size in points.

nCharFormat

Formatting attributes of the character. Possible values are given below. These values can be combined using a bitwise OR (|).

 

Value

Meaning

 

CHAR_ENDOFLINE

[0x001] This is the last character in a line.

 

CHAR_ENDOFPARA

[0x002] This is the last character in a paragraph.

 

CHAR_ENDOFWORD

[0x004] This is the last character of a word.

 

CHAR_ENDOFZONE

[0x008] This is the last character in a zone.

 

CHAR_ENDOFPAGE

[0x010] This is the last character on a page.

 

CHAR_ENDOFCELL

[0x020] This is the last character in a cell (applicable only for ZONE_TABLE type zones).

Lang

Value that represents the first language in which the recognized word is found. For a list of possible values, refer to LANGIDS.

Lang2

Value that represents the second language in which the recognized word is found.
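
The uFont and nCharFormat members are bit masks, so individual attributes are tested with a bitwise AND. The following is a minimal sketch of such tests. The flag values are copied from the tables above. The L_UINT and L_INT typedefs are stand-ins so the sketch compiles without the toolkit headers, and the helper names is_bold_italic and ends_line are hypothetical and not part of the toolkit.

```c
#include <stdbool.h>

/* Stand-in typedefs (assumption): in a real program these come from
   the toolkit headers. */
typedef unsigned int L_UINT;
typedef int          L_INT;

/* Flag values as listed in the uFont table above. */
#define FONT_ITALIC        0x001
#define FONT_BOLD          0x002
#define FONT_UNDERLINE     0x004
#define FONT_SUBSCRIPT     0x008
#define FONT_SUPERSCRIPT   0x010
#define FONT_SANSSERIF     0x020
#define FONT_SERIF         0x040
#define FONT_PROPORTIONAL  0x080

/* Flag values as listed in the nCharFormat table above. */
#define CHAR_ENDOFLINE     0x001
#define CHAR_ENDOFPARA     0x002
#define CHAR_ENDOFWORD     0x004
#define CHAR_ENDOFZONE     0x008
#define CHAR_ENDOFPAGE     0x010
#define CHAR_ENDOFCELL     0x020

/* Hypothetical helper: true only if both the bold and italic bits are set. */
static bool is_bold_italic(L_UINT uFont)
{
   const L_UINT mask = FONT_BOLD | FONT_ITALIC;
   return (uFont & mask) == mask;
}

/* Hypothetical helper: true if the character is the last one in its line. */
static bool ends_line(L_INT nCharFormat)
{
   return (nCharFormat & CHAR_ENDOFLINE) != 0;
}
```

Comparing the masked value against the full mask, as is_bold_italic does, requires every listed bit to be set. Comparing against zero, as ends_line does, is enough when a single bit is tested.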

Comments

The application should evaluate the nConfidence member when confidence information about the recognition is also required. Its most significant bit expresses the certainty of the word: the word is uncertain if this bit is set to one (1). The remaining bits represent the certainty of the character recognition as a value between 0 and 100. The smaller this value, the higher the confidence of the recognition. For more details, see the Confidence reporting topic.
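
Splitting nConfidence into its two parts can be sketched as follows. This is an illustration under stated assumptions, not the toolkit's own code: it assumes that "most significant bit" means the top bit of the L_INT type, which is represented here by a stand-in typedef. The mask HYP_WORD_UNCERTAIN_BIT and both helper functions are hypothetical names. Check the Confidence reporting topic for the authoritative bit layout before relying on this.

```c
#include <limits.h>
#include <stdbool.h>

/* Stand-in typedef (assumption): in a real program this comes from
   the toolkit headers. */
typedef int L_INT;

/* Hypothetical mask for the most significant bit of L_INT.
   Assumption: the word-uncertainty flag occupies the top bit of the type. */
#define HYP_WORD_UNCERTAIN_BIT (1u << (sizeof(L_INT) * CHAR_BIT - 1))

/* True if the word containing the character is flagged as uncertain. */
static bool word_is_uncertain(L_INT nConfidence)
{
   return ((unsigned int)nConfidence & HYP_WORD_UNCERTAIN_BIT) != 0u;
}

/* Character-level value from the remaining bits (documented range 0..100).
   A smaller value means higher confidence in the recognition. */
static unsigned int char_confidence_value(L_INT nConfidence)
{
   return (unsigned int)nConfidence & ~HYP_WORD_UNCERTAIN_BIT;
}
```

Working on the unsigned representation avoids sign-related surprises when the top bit is set.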

pRECOGCHARS is a pointer to a RECOGCHARS structure. Where the function parameter type is pRECOGCHARS, declare a RECOGCHARS variable, update the structure's fields, and pass the variable's address in the parameter. Declaring a pRECOGCHARS variable is necessary only if the program requires a pointer.
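
The declare, fill, and pass-by-address pattern described above might look like the sketch below. The scalar typedefs, the RECT layout, and the LANGIDS type are stand-ins so the code compiles without the toolkit headers, and HypotheticalReadStructSize is an invented function that only represents "some function whose parameter type is pRECOGCHARS". In a real program, include the toolkit headers instead of these declarations.

```c
#include <string.h>

/* Stand-in declarations (assumptions), normally supplied by the toolkit
   and platform headers. */
typedef unsigned int   L_UINT;
typedef int            L_INT;
typedef unsigned short L_WCHAR;
typedef int            LANGIDS;
typedef struct { long left, top, right, bottom; } RECT;

/* Member list copied from the structure definition in this topic. */
typedef struct _tagRecogChars
{
   L_UINT uStructSize;
   RECT rcArea;
   L_INT nYOffset;
   L_INT nSpace;
   L_INT nSpaceErr;
   L_WCHAR wGuessCode;
   L_WCHAR wGuessCode2;
   L_WCHAR wGuessCode3;
   L_INT nZoneIndex;
   L_INT nCellIndex;
   L_INT nConfidence;
   L_UINT uFont;
   L_INT nFontSize;
   L_INT nCharFormat;
   LANGIDS Lang;
   LANGIDS Lang2;
} RECOGCHARS, * pRECOGCHARS;

/* Hypothetical function standing in for any API that takes pRECOGCHARS. */
static L_UINT HypotheticalReadStructSize(pRECOGCHARS pChar)
{
   return pChar->uStructSize;
}

static L_UINT demo_pass_by_address(void)
{
   RECOGCHARS rc;                            /* a variable, not a pointer */

   memset(&rc, 0, sizeof(rc));               /* start from a known state */
   rc.uStructSize = (L_UINT)sizeof(RECOGCHARS);
   rc.wGuessCode = 0x0041;                   /* Unicode code for 'A' */

   return HypotheticalReadStructSize(&rc);   /* pass the variable's address */
}
```

No pRECOGCHARS variable is declared here, because the address-of operator supplies the pointer at the call site.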

If the recognition process cannot associate the current recognized word with any language, then Lang is updated with LANG_ID_NO.

If the recognized word can be found in more than one language, then Lang will be updated with the ID of the first language in which the recognized word was found and Lang2 will be updated with the second language in which the word was found.

This structure is used with the following functions:

L_DocGetRecognizedCharacters

L_DocSetRecognizedCharacters

L_DocFreeRecognizedCharacters