This structure describes a single recognized character. The recognition process produces one such structure for each recognized character, and it holds the most detailed information available about that character.
```c
typedef struct tagAT_REC_LETTER
{
    AT_DWORD AlternativeCodeOffset;
    AT_BYTE AlternativeCodesCount;
    enumIGRecLanguages AlternativeLanguage;
    AT_BYTE BackgroundColorIndex;
    AT_WORD Baseline;
    AT_WORD CapitalLetterHeight;
    AT_WCHAR Code;
    AT_BYTE Confidence;
    enumIGRecLetterExtraInfo ExtraInfo;
    enumIGRecFontType FontAttribute;
    AT_BYTE ForegroundColorIndex;
    enumIGRecLanguages Language;
    enumIGRecMakeupInfo Makeup;
    AT_RECTANGLE Rect;
    AT_WORD SpacesCount;
    enumIGRecSpaceType SpaceType;
    AT_BYTE UnderlineDotWidth;
    AT_BYTE UnderlineGapWidth;
    AT_DWORD WordSuggestionOffset;
    AT_BYTE WordSuggestionsCount;
    AT_BOOL WordUncertain;
    AT_WORD ZoneIndex;
} AT_REC_LETTER, * LPAT_REC_LETTER;
```
Name | Type | Description |
---|---|---|
AlternativeCodeOffset | AT_DWORD | Index of the first alternative character code for this AT_REC_LETTER in an external array of alternative codes. Use the IG_REC_alternative_letter_codes_get function to obtain the array of alternative codes for the recognized page to which this AT_REC_LETTER belongs. |
AlternativeCodesCount | AT_BYTE | Number of alternative character codes available in addition to the primary choice stored in the Code field. Use IG_REC_alternative_letter_codes_get for convenient access to alternative codes. Use the AlternativeCodeOffset field for fast, low-level access to alternative codes. |
AlternativeLanguage | enumIGRecLanguages | See the Language field. |
BackgroundColorIndex | AT_BYTE | Index of the background color within the palette of the recognition data. |
Baseline | AT_WORD | Y coordinate of the baseline in pixels. |
CapitalLetterHeight | AT_WORD | Expresses a measure of the capital letter height in pixels. Rough approximation: Font Size = CapitalLetterHeight * 100 / dpi. |
Code | AT_WCHAR | Unicode character code. This is the first choice of the recognition. |
Confidence | AT_BYTE | Certainty of the recognition of the character, ranging from 0 to 100. A value of 100 means that the Engine recognized the character with high confidence. In some cases a word may have some or all characters that are individually suspicious, but the characters are not marked as suspicious in the word bit. This is usually a result of language or user dictionary checking and means the word was validated by the spelling module. Applications that examine the character confidence information can use a threshold value at or below which the character is treated as a suspicious result. A value of 36 is recommended for this purpose: a value greater than 36 indicates that the character was recognized with high confidence, and a value of 36 or less marks the code as suspect. The confidence reporting system works best when all three recognition modules are used in the voting scheme (IG_REC_RM_OMNIFONT_PLUS3W recognition module). If other machine print recognition modules are used (IG_REC_RM_OMNIFONT_PLUS2W, IG_REC_RM_OMNIFONT_MTX, etc.), the confidence information is still available, but the ability of the system to report confidence properly is reduced. This results in a higher level of false negative and false positive reporting of suspected recognition results. |
ExtraInfo | enumIGRecLetterExtraInfo | Bit mask containing additional information about the character. See enumIGRecLetterExtraInfo for possible values. Bit fields not listed in enumIGRecLetterExtraInfo are used internally. |
FontAttribute | enumIGRecFontType | Font information about the recognized character. |
ForegroundColorIndex | AT_BYTE | Index of the foreground color within the palette of the recognition data. |
Language | enumIGRecLanguages | Language to which the recognized word belongs. When the recognized word cannot be associated with any language, this is signaled with enumIGRecLanguages.IG_REC_LANG_NO. If the recognized word can also be found in another language dictionary, then both the Language and AlternativeLanguage fields contain the IDs of those languages. |
Makeup | enumIGRecMakeupInfo | Because the recognition data does not contain extra characters marking the ends of lines, paragraphs, pages, etc., this information is stored for the particular characters in this field. It can be any bitwise OR combination of the enumIGRecMakeupInfo flags. |
Rect | AT_RECTANGLE | Bounding rectangle that exactly contains the character, in pixels. |
SpacesCount | AT_WORD | Number of spaces. Makes sense only if Code is a space. |
SpaceType | enumIGRecSpaceType | Space type. Makes sense only if Code is a space. |
UnderlineDotWidth | AT_BYTE | Width of a dot in pixels if the "underline" is a dotted underline. 0 for a simple underline or for no underline. This value is also set if Code is a space and SpaceType is enumIGRecSpaceType.IG_REC_SPC_LEADERDOT. |
UnderlineGapWidth | AT_BYTE | Width of a gap in pixels if the "underline" is a dotted underline. 0 for a simple underline or for no underline. This value is also set if Code is a space and SpaceType is enumIGRecSpaceType.IG_REC_SPC_LEADERDOT. |
WordSuggestionOffset | AT_DWORD | Index of the first word suggestion for this AT_REC_LETTER in an external array of word suggestions. Makes sense only if Code is not a space. Use the IG_REC_word_suggestions_get function to obtain the array of word suggestions for the recognized page to which this AT_REC_LETTER belongs. |
WordSuggestionsCount | AT_BYTE | Number of word suggestions for the current word to which this AT_REC_LETTER belongs. Makes sense only if Code is not a space. Use IG_REC_word_suggestions_get for convenient access to word suggestions. Use the WordSuggestionOffset field for fast, low-level access to word suggestions. |
WordUncertain | AT_BOOL | Certainty/uncertainty of the word. The word is uncertain if this value is TRUE. |
ZoneIndex | AT_WORD | Index of the zone in the zone list which contains the character. |
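
The Confidence description recommends 36 as the cutoff for suspect characters. The following is a minimal sketch of that check, not library code: `CONF_LETTER_SKETCH` is a hypothetical, reduced stand-in for AT_REC_LETTER that carries only the fields used here, the plain C field types are assumptions, and the helper names are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reduced stand-in for AT_REC_LETTER (assumption: the real
   structure and its AT_* types come from the SDK headers). */
typedef struct {
    unsigned short Code;       /* primary character code */
    unsigned char  Confidence; /* 0..100 */
} CONF_LETTER_SKETCH;

/* Threshold recommended in the Confidence field description. */
#define SUSPECT_CONFIDENCE_THRESHOLD 36

/* Returns 1 if the character is suspect (confidence of 36 or less),
   0 if it was recognized with high confidence (greater than 36). */
static int letter_is_suspect(const CONF_LETTER_SKETCH *letter)
{
    assert(letter != NULL);
    return letter->Confidence <= SUSPECT_CONFIDENCE_THRESHOLD;
}

/* Counts the suspect characters in an array of letters. */
static size_t count_suspect_letters(const CONF_LETTER_SKETCH *letters,
                                    size_t count)
{
    size_t suspects = 0;
    for (size_t i = 0; i < count; ++i) {
        if (letter_is_suspect(&letters[i])) {
            ++suspects;
        }
    }
    return suspects;
}
```

An application might combine this per-character check with the WordUncertain field, since a word validated by the spelling module can contain characters that are individually suspect.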
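
The low-level access described for AlternativeCodeOffset and AlternativeCodesCount amounts to taking a slice of the page-level array. The sketch below illustrates the indexing only, under stated assumptions: the page array is assumed to have been obtained already (the call to IG_REC_alternative_letter_codes_get is not shown), its elements are assumed to be character codes of the same width as Code, and `ALT_LETTER_SKETCH` and `letter_alternatives` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reduced stand-in for AT_REC_LETTER (field types assumed). */
typedef struct {
    unsigned short Code;                  /* primary choice */
    unsigned long  AlternativeCodeOffset; /* index of the first alternative */
    unsigned char  AlternativeCodesCount; /* number of alternatives */
} ALT_LETTER_SKETCH;

/* Returns a pointer to the first alternative code for `letter` inside the
   page-level array and stores the number of alternatives in *out_count.
   Returns NULL with *out_count == 0 if the letter has no alternatives or
   if the slice would fall outside the array. */
static const unsigned short *letter_alternatives(
    const ALT_LETTER_SKETCH *letter,
    const unsigned short *page_codes,
    size_t page_code_count,
    size_t *out_count)
{
    assert(letter != NULL && out_count != NULL);
    *out_count = 0;
    if (page_codes == NULL || letter->AlternativeCodesCount == 0) {
        return NULL;
    }
    size_t first = (size_t)letter->AlternativeCodeOffset;
    size_t n = (size_t)letter->AlternativeCodesCount;
    /* Bounds check written to avoid overflow in first + n. */
    if (first > page_code_count || n > page_code_count - first) {
        return NULL;
    }
    *out_count = n;
    return page_codes + first;
}
```

The same offset-plus-count pattern would apply to WordSuggestionOffset and WordSuggestionsCount, although the element layout of the word suggestion array is not described here.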
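
The rough font size approximation given for CapitalLetterHeight can be written as a one-line helper. This is a sketch of the stated formula only; the function name `approx_font_size` is hypothetical, the image resolution is assumed to be known to the caller, and integer division truncates the result.

```c
#include <assert.h>

/* Rough approximation from the CapitalLetterHeight description:
   Font Size = CapitalLetterHeight * 100 / dpi.
   `dpi` is the image resolution and must be greater than zero. */
static unsigned approx_font_size(unsigned short capital_letter_height,
                                 unsigned dpi)
{
    assert(dpi > 0);
    return (unsigned)capital_letter_height * 100u / dpi;
}
```

For example, a capital letter height of 36 pixels at 300 dpi gives 36 * 100 / 300 = 12.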