SmartZone v6.0 for .NET - Updated
Define and Edit Character Sets

Define Character Sets

SmartZone ICR and SmartZone OCR each let you define one or more character sets with the CharacterSet class. Both engines provide Add and Remove methods, which add or remove individual characters, the characters in a string, or predefined character sets from the current character set collection, so you can build subsets as needed.

To check whether a character is in the current character set, use the Contains method, which is available in both ICR and OCR.
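
The add, remove, and contains behavior described above can be pictured with a small, language-neutral model. This is only a sketch in Python: the class and method names are hypothetical stand-ins and are not the SmartZone API, whose exact signatures are documented in the ICR and OCR class references. The character strings are copied from the Digits and ICR CurrencySymbols tables in this topic.

```python
# Illustrative model only: these names are hypothetical, not the SmartZone API.
DIGITS = "0123456789"        # from the Digits table
CURRENCY_SYMBOLS = "$¢£¥€"   # from the ICR CurrencySymbols table


class CharacterSetModel:
    """Tiny stand-in for a character set collection."""

    def __init__(self):
        self._chars = set()

    def add(self, characters):
        """Add every character of a string (or of a predefined set string)."""
        self._chars.update(characters)

    def remove(self, characters):
        """Remove every character of a string; absent characters are ignored."""
        self._chars.difference_update(characters)

    def contains(self, character):
        """Report whether a single character is currently in the set."""
        return character in self._chars


# Build a set for a US-dollar amount field.
amount_field = CharacterSetModel()
amount_field.add(DIGITS)            # a predefined set
amount_field.add(CURRENCY_SYMBOLS)  # another predefined set
amount_field.add(".,")              # individual characters in a string
amount_field.remove("¢£¥€")         # drop the currency symbols not expected
```

After these calls the model contains the digits, the dollar sign, the period, and the comma, which is the kind of narrow subset this topic recommends.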

Language Support for Character Sets

If you have specified a language, it will be used to refine the contents of any character set containing alphabetic entries. For example, specifying a language of Italian and a character set of AlphaNumeric would limit the returned results to only letters included in the Italian alphabet plus digits.
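
One plausible way to picture this refinement is as an intersection: only characters that appear in both the chosen character set and the language's characters survive. The Python sketch below is a model under that assumption, not SmartZone's implementation. The function name is hypothetical, the Italian string is the ICR row of the Italian table in this topic, and AlphaNumeric is composed from the ICR Digits, UpperCase, and LowerCase rows.

```python
# Model of language refinement as a set intersection (an assumption, not SmartZone code).
ITALIAN_ICR = (
    "!\"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ\\_"
    "abcdefghijklmnopqrstuvwxyz{|}¢£¥«»ÀÈÉÌÒÙàèéìòù€"
)
DIGITS = "0123456789"
UPPER_ICR = "ABCDEFGHIJKLMNOPQRSTUVWXYZÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÑÒÓÔÕÖØÙÚÛÜ"
LOWER_ICR = "abcdefghijklmnopqrstuvwxyzßàáâãäåæçèéêëìíîïñòóôõöøùúûü"
ALPHANUMERIC_ICR = DIGITS + UPPER_ICR + LOWER_ICR


def refine_by_language(character_set, language_characters):
    """Keep only the characters of character_set that the language also lists."""
    allowed = set(language_characters)
    return "".join(ch for ch in character_set if ch in allowed)


italian_alphanumeric = refine_by_language(ALPHANUMERIC_ICR, ITALIAN_ICR)
print(italian_alphanumeric)
```

In this model the result keeps the digits, the unaccented letters, and the accented letters listed for Italian, while letters such as ñ and ß drop out.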

Supported languages include the following:

English

Characters
ICR !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ\_abcdefghijklmnopqrstuvwxyz{|}¢£¥€
OCR !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~¢£¤¥¦§©«¬­®°±²³´·¹º»¼½¾–—‘’‚“”„†‡•…‰‹›€™

French

Characters
ICR !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ\_abcdefghijklmnopqrstuvwxyz{|}¢£¥«»ÀÇÈÉÊËÎÏÔÙÛÜàâçèéêëîïôùûü€
OCR !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~¢£¤¥¦§©«¬­®°±²³´·¹º»¼½¾ÀÂÇÈÉÊËÎÏÔÙÛàâçèéêëîïôùûŒœ–—‘’‚“”„†‡•…‰‹›€™

Spanish

Characters
ICR !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ\_abcdefghijklmnopqrstuvwxyz{|}¡¢£¥«»¿ÁÉÍÑÓÚÜáéíñóúü€
OCR !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~¡¢£¤¥¦§©ª«¬­®°±²³´·¹º»¼½¾¿ÁÇÈÉÍÑÒÓÚÜáçèéíñòóúü–—‘’‚“”„†‡•…‰‹›€™

Italian

Characters
ICR !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ\_abcdefghijklmnopqrstuvwxyz{|}¢£¥«»ÀÈÉÌÒÙàèéìòù€
OCR !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~¢£¤¥¦§©«¬­®°±²³´·¹º»¼½¾ÀÈÉÌÒÓÙàèéìòóù–—‘’‚“”„†‡•…‰‹›€™

German

Characters
ICR !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ\_abcdefghijklmnopqrstuvwxyz{|}¢£¥«»ÄÖÜßäöü„€
OCR !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~¢£¤¥¦§©«¬­®°±²³´·¹º»¼½¾ÄÉÖ×Üßäéö÷ü–—‘’‚“”„†‡•…‰‹›€™

Dutch

Characters
ICR !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ\_abcdefghijklmnopqrstuvwxyz{|}¢£¥ÀÁÄ ÇÈÉÊËÌÍÎÏÑÒÓÖÙÚÛÜàáâäçèéêëìíîïñòóöùúü€
OCR !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~¢£¤¥¦§©«¬­®°±²³´·¹º»¼½¾ÈÉÊËÏÖÜèéêëïöüÿŸ–—‘’‚“”„†‡•…‰‹›€™

Portuguese

Characters
ICR !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ\_abcdefghijklmnopqrstuvwxyz{|}¢£¥ÀÁÂÇÈÉÊÍÒÓÔÕÚÜàáâçèéêíòóôõúü€
OCR !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~¢£¤¥¦§©«¬­®°±²³´·¹º»¼½¾ÀÁÂÃÇÉÊÍÒÓÔÕÚÜàáâãçéêíòóôõúü–—‘’‚“”„†‡•…‰‹›€™

Norwegian

Characters
ICR !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ\_abcdefghijklmnopqrstuvwxyz{|}¢£¥«»ÅÆØåæø€
OCR !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~¢£¤¥¦§©«¬­®°±²³´·¹º»¼½¾ÅÆÉØåæéø–—‘’‚“”„†‡•…‰‹›€™

Finnish

Characters
ICR !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ\_abcdefghijklmnopqrstuvwxyz{|}¢£¥«»ÄÅÖäåö€
OCR !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~¢£¤¥¦§©«¬­®°±²³´·¹º»¼½¾ÄÅÖäåö–—‘’‚“”„†‡•…‰‹›€™

Danish

Characters
ICR !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ\_abcdefghijklmnopqrstuvwxyz{|}¢£¥«»ÅÆØåæø„€
OCR !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~¢£¤¥¦§©«¬­®°±²³´·¹º»¼½¾ÅÆÉØåæéø–—‘’‚“”„†‡•…‰‹›€™

Swedish

Characters
ICR !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ\_abcdefghijklmnopqrstuvwxyz{|}¢£¥«»ÄÅÖäåö€
OCR !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~¢£¤¥¦§©«¬­®°±²³´·¹º»¼½¾ÄÅÉÖäåéö–—‘’‚“”„†‡•…‰‹›€™

Western European

Characters
ICR !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ\_abcdefghijklmnopqrstuvwxyz{|}¡¢£¥«»¿ÀÁÄÅÆÇÈÉÊËÌÍÎÏÑÒÓÔÕÖØÙÚÛÜßàáâäåæçèéêëìíîïñòóôõöøùúûü„€
OCR !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~  ¡¢£¤¥¦§¨©ª«¬­®¯°±²³´µ¶·¸¹º»¼½¾¿ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖ×ØÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõö÷øùúûüýþÿŒœŠšŸŽžƒˆ˜–—‘’‚“”„†‡•…‰‹›€™

For the best recognition accuracy, start from the narrowest set that still includes every value that could be returned, then limit the possible results further by applying the predefined character sets listed below. Character sets limit (reduce) the possible returned values once that universe of possible values has been defined.

For example, since é is not included in the English language, in order to accurately read the word "Résumé", you could either:

You could improve recognition further by omitting any other characters you do not expect to encounter.
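
Whether a given language can return every character of an expected word can be checked directly against the tables above. The Python sketch below is illustrative only: the helper name is hypothetical and the strings are the ICR rows of the English and French tables in this topic.

```python
# Hypothetical helper; the character strings are copied from the ICR language tables.
ENGLISH_ICR = (
    "!\"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ\\_"
    "abcdefghijklmnopqrstuvwxyz{|}¢£¥€"
)
FRENCH_ICR = (
    "!\"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ\\_"
    "abcdefghijklmnopqrstuvwxyz{|}¢£¥«»ÀÇÈÉÊËÎÏÔÙÛÜàâçèéêëîïôùûü€"
)


def missing_characters(text, allowed):
    """Return the distinct characters of text that are not in allowed, sorted."""
    return sorted(set(text) - set(allowed))


missing_in_english = missing_characters("Résumé", ENGLISH_ICR)
missing_in_french = missing_characters("Résumé", FRENCH_ICR)
```

Here the English list reports é as missing, while the French list reports nothing missing. A check like this can help confirm that a narrowed set still covers the values a field is expected to hold.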

Predefined Character Sets

There are 12 additional predefined character sets available as properties:

AllAlphas

Includes all upper and lower case alpha characters.

Character Set Characters
ICR: AllAlphas ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyzÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÑÒÓÔÕÖØÙÚÛÜßàáâãäåæçèéêëìíîïñòóôõöøùúûü
OCR: AllAlphas ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyzªµºÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖØÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõöøùúûüýþÿŒœŠšŸŽžƒˆ

AllCharacters

Includes all upper and lower case alpha, all digits, punctuation, currency and arithmetic characters.

Note that [ and ] are omitted from ICR.

Character Set Characters
ICR: AllCharacters !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ\_abcdefghijklmnopqrstuvwxyz{|}¡¢£¥«»¿ÀÁÂÄÅÆÇÈÉÊËÌÍÎÏÑÒÓÔÕÖØÙÚÛÜßàáâäåæçèéêëìíîïñòóôõöøùúûü„€
OCR: AllCharacters !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~¡¢£¤¥¦§¨©ª«¬­®¯°±²³´µ¶·¸¹º»¼½¾¿ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖ×ØÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõö÷øùúûüýþÿŒœŠšŸŽžƒˆ˜–—‘’‚“”„†‡•…‰‹›€™

AlphaNumeric

Includes all upper and lower case alpha and digit characters.

Character Set Characters
ICR: AlphaNumeric 0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyzÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÑÒÓÔÕÖØÙÚÛÜßàáâãäåæçèéêëìíîïñòóôõöøùúûü
OCR: AlphaNumeric 0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyzª²³µ¹º¼½¾ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖØÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõöøùúûüýþÿŒœŠšŸŽžƒˆ

Arithmetic

Includes all digits, arithmetic and arithmetic punctuation characters.

Character Set Characters
ICR: Arithmetic %()*+,-./0123456789<=>
OCR: Arithmetic %()*+,-./0123456789<=>|~¬±×÷

ArithmeticSymbols

Includes all arithmetic characters.

Character Set Characters
ICR: ArithmeticSymbols +<=>
OCR: ArithmeticSymbols +<=>|~¬±×÷

Currency

Includes all digits, currency and currency punctuation characters.

Character Set Characters
ICR: Currency $',-.0123456789=¢£¥€
OCR: Currency $',-.0123456789=¢£¤¥€

CurrencySymbols

Includes all currency characters.

Character Set Characters
ICR: CurrencySymbols $¢£¥€
OCR: CurrencySymbols $¢£¤¥€

Digits

Includes all digit characters.

Character Set Characters
ICR: Digits 0123456789
OCR: Digits 0123456789

LowerCase

Includes only lower case alpha characters.

Character Set Characters
ICR: LowerCase abcdefghijklmnopqrstuvwxyzßàáâãäåæçèéêëìíîïñòóôõöøùúûü
OCR: LowerCase abcdefghijklmnopqrstuvwxyzªµºßàáâãäåæçèéêëìíîïðñòóôõöøùúûüýþÿœšžƒ

PhoneNumber

Includes only phone number characters.

The phone number's extension can be preceded by the individual characters x or X. You can also precede the extension with ext or EXT.
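
The letters in this set follow from the extension markers just described. The short Python sketch below rebuilds the set from digits, separators, and the letters of x, X, ext, and EXT, and adds a character-level check. The helper name is hypothetical, and the string is copied from the PhoneNumber table in this topic.

```python
# Hypothetical helper names; the character string is copied from the PhoneNumber table.
PHONE_NUMBER = "()+-./0123456789ETXetx"

# Digits and separators, plus the letters needed to spell x, X, ext, and EXT.
rebuilt = set("0123456789") | set("()+-./") | set("xX") | set("ext") | set("EXT")


def uses_only_phone_characters(text):
    """True when every character of text is in the PhoneNumber set."""
    return all(ch in PHONE_NUMBER for ch in text)
```

This check works only at the character level. It does not validate the layout of a phone number, and a space fails it because the listed set does not include one.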

Character Set Characters
ICR: PhoneNumber ()+-./0123456789ETXetx
OCR: PhoneNumber ()+-./0123456789ETXetx

Punctuation

Includes only punctuation characters.

Note that [ and ] are omitted from SmartZone ICR.

Character Set Characters
ICR: Punctuation !"#%&'()*,-./:;?@\_{|}¡¿«»„
OCR: Punctuation !"#%&'()*,-./:;?@[\]^_`{}¡¦§¨©«­®¯°´¶·¸»¿˜–—‘’‚“”„†‡•…‰‹›™

UpperCase

Includes only upper case alpha characters.

Character Set Characters
ICR: UpperCase ABCDEFGHIJKLMNOPQRSTUVWXYZÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÑÒÓÔÕÖØÙÚÛÜ
OCR: UpperCase ABCDEFGHIJKLMNOPQRSTUVWXYZÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖØÙÚÛÜÝÞŒŠŸŽ

Edit Character Sets

For optimal recognition results, use a character set that includes all of the characters that may be encountered, and only those characters.

SmartZone lets you narrow character sets into subsets to increase accuracy, confidence, and speed, using the following methods: