ImageGear for C and C++ on Windows v19.8 - Updated
Confidence Reporting

For some applications, it may be important to know how reliable the recognized text is. Such applications may need additional confidence information for the recognized characters, words, or both.

Confidence information can be retrieved directly into application memory by calling the IG_REC_letters_get function after the IG_REC_image_recognize function call. The IG_REC_letters_get function provides the most detailed information about the recognized data: it returns an AT_REC_LETTER structure for each recognized character.

The AT_REC_LETTER structure provides character recognition confidence information via its Confidence field, which ranges from 0 to 100. A value of 100 means that the Engine recognized the character with high confidence.

Applications that examine the character confidence information can use a threshold value, at or below which the character is treated as a suspicious result. A value of 36 is recommended for this purpose. A value greater than 36 indicates that the character was recognized with high confidence. A value of 36 or less marks the character code as suspect.

In some cases, a word may have some or all characters that are individually suspicious, but the characters are not marked as suspicious in the word bit. This is usually a result of language or user dictionary checking: it means the word was validated by the spelling module.

The confidence reporting system works best when all three recognition modules are used in the voting scheme (the IG_REC_RM_OMNIFONT_PLUS3W recognition module). If other machine print recognition modules are used (IG_REC_RM_OMNIFONT_PLUS2W, IG_REC_RM_OMNIFONT_MTX, etc.), the confidence information is still available, but the ability of the system to report confidence properly is reduced. This results in a higher level of false negative and false positive reporting of suspected recognition results.

The AT_REC_LETTER structure also has a WordUncertain field, which indicates whether the word can be considered certain. The word is uncertain if this value is TRUE.
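
The threshold check and the word-level flag can be combined when reviewing results. The following is a minimal sketch, not ImageGear code: DemoLetter is a hypothetical stand-in that models only the Confidence and WordUncertain fields described above (the real AT_REC_LETTER structure is defined in the ImageGear headers and has more members), and the helper names and the review policy are assumptions made for illustration. It treats a character as suspicious when its confidence is 36 or less, and flags it for review only when its word is also uncertain, so that words validated by the spelling module are not flagged.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the two AT_REC_LETTER fields discussed above.
   The real structure is declared in the ImageGear headers. */
typedef struct {
    int Confidence;    /* 0..100; higher means more confident */
    int WordUncertain; /* nonzero (TRUE) if the word is uncertain */
} DemoLetter;

/* Recommended threshold from the text: 36 or less is suspicious. */
enum { SUSPICIOUS_THRESHOLD = 36 };

/* Returns 1 if the character confidence is at or below the threshold. */
static int is_char_suspicious(const DemoLetter *letter)
{
    assert(letter != NULL);
    return letter->Confidence <= SUSPICIOUS_THRESHOLD;
}

/* Counts characters that are suspicious AND belong to an uncertain word.
   This policy is an assumption; an application may prefer to flag every
   suspicious character regardless of the word-level result. */
static size_t count_letters_for_review(const DemoLetter *letters, size_t count)
{
    size_t i;
    size_t flagged = 0;

    assert(letters != NULL || count == 0);
    for (i = 0; i < count; ++i) {
        if (is_char_suspicious(&letters[i]) && letters[i].WordUncertain) {
            ++flagged;
        }
    }
    return flagged;
}
```

In a real application, the array passed to a helper like count_letters_for_review would be built from the AT_REC_LETTER data obtained through IG_REC_letters_get, copying or reading the Confidence and WordUncertain fields directly.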