Define and Edit Character Sets
Define Character Sets
SmartZone ICR/OCR lets you define one or more character sets using the CharacterSet class (ICR: CharacterSet; OCR: CharacterSet). The Add (ICR: Add; OCR: Add) and Remove (ICR: Remove; OCR: Remove) methods add characters to, or remove characters from, your current character set. The characters can be supplied either as a string of one or more characters or as a predefined character set, so you can create subsets as needed.
To check whether a character is in the current character set, use the Contains method (ICR: Contains; OCR: Contains).
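The Add, Remove, and Contains behavior described above can be pictured with a small Python model. This is only a sketch of the idea: `CharacterSetModel` and its lowercase methods are hypothetical names invented for illustration and are not the SmartZone CharacterSet API.

```python
class CharacterSetModel:
    """Illustrative stand-in for a character set (not the SmartZone class)."""

    def __init__(self, chars=""):
        self._chars = set(chars)

    @staticmethod
    def _as_set(other):
        # Accept either a string of characters or another model instance.
        if isinstance(other, CharacterSetModel):
            return set(other._chars)
        return set(other)

    def add(self, other):
        """Add every character in a string or in another set."""
        self._chars |= self._as_set(other)

    def remove(self, other):
        """Remove every character in a string or in another set."""
        self._chars -= self._as_set(other)

    def contains(self, ch):
        """Report whether a single character is in the set."""
        return ch in self._chars


digits = CharacterSetModel("0123456789")  # plays the role of a predefined set

current = CharacterSetModel()
current.add(digits)    # add a whole set
current.add("-/")      # add individual characters from a string
current.remove("0")    # narrow the set again
```

After these calls, `current.contains("5")` and `current.contains("-")` are true, while `current.contains("0")` is false.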
Language Support for Character Sets
Multiple languages are supported. For supported languages and their characters, see the AllCharacters property (ICR: AllCharacters; OCR: AllCharacters). Other predefined character sets are listed under Predefined Character Sets.
Set the Language property (ICR: Language; OCR: Language) to refine the contents of any character set containing alphabetic entries. For example, specifying a language of Italian and a character set of AlphaNumeric would limit the returned results to only letters included in the Italian alphabet plus digits.
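As a rough illustration of how a language setting narrows an alphanumeric set, the Python sketch below builds "letters of the language plus digits". The names `LANGUAGE_LETTERS` and `alphanumeric_for` are hypothetical, and the Italian letter list is an assumption (the traditional 21-letter alphabet, without accented vowels); the characters SmartZone actually associates with each language are the ones documented under AllCharacters.

```python
import string

# Assumption for illustration: the traditional 21-letter Italian alphabet,
# which omits j, k, w, x, and y.
ITALIAN_LOWER = "abcdefghilmnopqrstuvz"

LANGUAGE_LETTERS = {
    "English": set(string.ascii_letters),
    "Italian": set(ITALIAN_LOWER + ITALIAN_LOWER.upper()),
}


def alphanumeric_for(language):
    """Return the letters of the given language plus the digits 0-9."""
    return LANGUAGE_LETTERS[language] | set(string.digits)


italian_alnum = alphanumeric_for("Italian")
english_alnum = alphanumeric_for("English")
```

In this model, `"k"` is a candidate under English but not under Italian, while digits remain candidates under both.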
Improve Recognition Accuracy
For the best recognition accuracy, use the narrowest character set that still includes every value that can be returned. First define the universe of possible returned values, then reduce it by applying the predefined character sets listed under Predefined Character Sets.
For example, because é is not part of the English character set, to accurately read the word "Résumé" you could either:
- specify a language that includes é, such as French, since that language includes English letters plus é.
- specify the language English and add the letter é.
You could improve recognition further by omitting any other characters you do not expect to encounter.
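The "Résumé" example can be worked through with plain Python sets. This sketch does not call SmartZone; it assumes, for illustration only, that the English letters are the 52 ASCII letters, and it writes é as the escape `\u00e9` to avoid any ambiguity about its encoding.

```python
import string

E_ACUTE = "\u00e9"                       # the character é
word = f"R{E_ACUTE}sum{E_ACUTE}"         # "Résumé"

# Assumption: English letters modeled as the ASCII letters.
english = set(string.ascii_letters)

# Characters in the word that English alone cannot return.
missing = {ch for ch in word if ch not in english}

# Second option from the text: English plus the letter é.
english_plus = english | {E_ACUTE}
covered = all(ch in english_plus for ch in word)

# Narrowing further: keep only the characters you expect to encounter.
narrowed = set(word)
```

Here `missing` contains only é, `covered` is true, and `narrowed` holds just five distinct characters (R, é, s, u, m) instead of 53.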
Predefined Character Sets
There are 12 additional predefined character sets available as properties:
Edit Character Sets
Optimal recognition results are obtained by using a character set that includes all, and only, the characters that may be encountered.
SmartZone lets you narrow character sets into subsets to increase accuracy, confidence, and speed using the following methods:
- Add(String) (ICR: Add(String); OCR: Add(String))
- Add(CharacterSet) (ICR: Add(CharacterSet); OCR: Add(CharacterSet))
- Remove(String) (ICR: Remove(String); OCR: Remove(String))
- Remove(CharacterSet) (ICR: Remove(CharacterSet); OCR: Remove(CharacterSet))
(SmartZone OCR Only) Character sets can be customized to include characters from any supported language, with the following exceptions:
- Greek, Chinese, Japanese, and Korean languages cannot be combined with other languages. English language characters are predefined and already included in these character sets.
- Thai and Vietnamese characters can only be combined with English language characters.
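The combination rules above can be expressed as a small check. The Python sketch below is one reading of those rules, not SmartZone behavior: `can_combine` is a hypothetical helper, and it assumes that Greek, Chinese, Japanese, and Korean are always used alone (English being already included), and that Thai and Vietnamese may each be paired with English but not with each other.

```python
# Languages that, per the rules above, cannot be combined with any other.
STANDALONE = {"Greek", "Chinese", "Japanese", "Korean"}

# Languages that may be combined only with English.
ENGLISH_ONLY_PARTNERS = {"Thai", "Vietnamese"}


def can_combine(languages):
    """Return True if the given languages may share one character set."""
    langs = set(languages)
    if len(langs) <= 1:
        return True  # a single language is always acceptable
    if langs & STANDALONE:
        return False  # these never mix with another language
    restricted = langs & ENGLISH_ONLY_PARTNERS
    if restricted:
        # Assumed reading: exactly one of Thai/Vietnamese, plus English only.
        return len(restricted) == 1 and langs - restricted == {"English"}
    return True
```

For example, `can_combine(["French", "Italian"])` and `can_combine(["Thai", "English"])` return True, while `can_combine(["Greek", "French"])` and `can_combine(["Thai", "French"])` return False.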