Code Page List
The following table summarizes all the Code Pages supported by the Engine. One of these must be specified as the Code Page of the final output document.
CODE PAGE NAME | WIDTH | DESCRIPTION | IMPLEMENTATION |
Windows ANSI | 8-bit | Code Page 1252 | Hard-coded (Default) |
Windows Greek | 8-bit | Code Page 1253 | Hard-coded |
Windows Eastern | 8-bit | Code Page 1250 | Hard-coded |
Windows Turkish | 8-bit | Code Page 1254 | Hard-coded |
Windows Baltic | 8-bit | Code Page 1257 | Hard-coded |
Windows Cyrillic | 8-bit | Code Page 1251 | Hard-coded |
Windows Esperant | 8-bit | Non Standard Win | Derived from CP 1252 |
Code Page 437 | 8-bit | DOS Latin US | Hard-coded |
Greek-ELOT | 8-bit | DOS Greek | Hard-coded |
Greek-MEMOTEK | 8-bit | DOS Greek | Hard-coded |
Code Page 850 | 8-bit | DOS Latin 1 | Derived from CP 437 |
Code Page 852 | 8-bit | DOS Latin 2 | Derived from CP 437 |
Code Page 860 | 8-bit | DOS Portuguese | Derived from CP 437 |
Code Page 863 | 8-bit | DOS French-Canadian | Derived from CP 437 |
Code Page 865 | 8-bit | DOS Nordic | Derived from CP 437 |
Code Page 866 | 8-bit | DOS Cyrillic CIS | Derived from CP 437 |
CWI Magyar | 8-bit | DOS Hungarian | Derived from CP 437 |
Magyar Ventura | 8-bit | DOS Hungarian | Derived from CP 437 |
IVKAM C-S | 8-bit | Czech & Slovak | Derived from CP 437 |
Mazowia Polish | 8-bit | DOS Polish | Derived from CP 437 |
Sloven & Croat | 8-bit | 7 bits used | Derived from CP 437 |
Turkish | 8-bit | DOS Turkish | Derived from CP 437 |
Icelandic | 8-bit | DOS Icelandic | Derived from CP 437 |
Macintosh | 8-bit | Mac Western | Hard-coded |
Mac INSO Latin 2 | 8-bit | MAC CE | Hard-coded |
Mac Central EU | 8-bit | PT 202 | Hard-coded |
Mac Primus CE u | 8-bit | MAC CE | Hard-coded |
Maltese | 8-bit | Malta; 7 bits used | Derived from CP 437 |
OCR | 8-bit | Non Standard Win | Derived from CP 437 |
Unicode | 16-bit | multilingual | Hard-coded |
WordPerfect | 16-bit | multilingual | Hard-coded |
WordPerfect Old | 16-bit | multilingual | Hard-coded |
Roman 8 | 8-bit | For HP printers | Hard-coded |
UTF-8 | 16-bit | multilingual | Hard-coded |
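To illustrate why the choice of Code Page matters for the output document, the following minimal Python sketch decodes one byte value under several 8-bit code pages and checks whether a sample string can be encoded at all. It uses Python's built-in codecs (cp1252, cp1253, cp1251, cp437, cp850) as stand-ins for the corresponding table entries. That pairing is an assumption made for illustration; the Engine's own tables are not shown here and may differ in detail. The helper names decode_byte and can_encode are hypothetical.

```python
# Python codec names standing in for a few table entries. This pairing is an
# assumption for illustration; the Engine's own tables may differ in detail.
STAND_INS = {
    "Windows ANSI": "cp1252",
    "Windows Greek": "cp1253",
    "Windows Cyrillic": "cp1251",
    "Code Page 437": "cp437",
    "Code Page 850": "cp850",
}


def decode_byte(byte_value):
    """Decode one byte value under every stand-in code page."""
    raw = bytes([byte_value])
    return {name: raw.decode(codec) for name, codec in STAND_INS.items()}


def can_encode(text, codec):
    """Return True if every character of text exists in the given code page."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


# One byte value, read differently depending on the code page.
for name, char in decode_byte(0xE1).items():
    print(f"{name:17} 0xE1 -> {char}")

# An 8-bit page holds only its own repertoire; UTF-8 covers all of Unicode.
sample = "Ωmega"
for codec in ("cp1252", "cp1253", "utf-8"):
    print(f"{codec:7} can encode {sample!r}: {can_encode(sample, codec)}")

# Windows ANSI (CP 1252) versus ISO 8859-1: the two differ in 0x80-0x9F.
print(repr(b"\x80".decode("cp1252")), repr(b"\x80".decode("latin-1")))
```

A document written with the wrong Code Page is therefore not merely mislabeled: bytes above 0x7F are read back as different characters, and characters outside the chosen page's repertoire cannot be written at all.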
Programmatically, the Engine's current Code Page setting can be read or set through the ImGearRecOutputManager.CodePage property. The exact list of available Code Pages can be retrieved from the ImGearRecOutputManager.CodePages property.
The name Windows ANSI used here does not refer to the actual ANSI character set (ISO 8859-1); it denotes Windows Code Page 1252. Details of the WordPerfect code pages:
If none of the offered Code Pages meets your needs, you can develop your own derivative 8-bit Code Page by adding a new Code Page Definition file to the Engine Binary directory. The new file should have a .SET file extension and contain a separate section for your custom Code Page. Five Code Pages are available as the basis for customized Code Pages:
The characters belonging to the custom Code Page should be given in Unicode and follow the other layout conventions found in RECOGN.SET, the basic Code Page Definition file.
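The idea of a derivative 8-bit Code Page can be sketched in Python as a base table with a few positions overridden by Unicode characters. This is a conceptual illustration only: it does not reproduce the .SET file syntax, which is defined by RECOGN.SET. It uses Python's cp437 codec as a stand-in for the Code Page 437 base, and the function names and the single override (the euro sign placed at 0x9B, where CP 437 has the cent sign) are hypothetical.

```python
def derive_code_page(base_codec, overrides):
    """Build byte<->character tables for a code page derived from a base.

    overrides maps a byte value (0-255) to the Unicode character that
    replaces the base page's character at that position.
    """
    # Start from the base page: one Unicode character per byte value.
    decode_table = [bytes([b]).decode(base_codec) for b in range(256)]
    for byte_value, char in overrides.items():
        decode_table[byte_value] = char
    # Invert the table for encoding; overridden characters drop out.
    encode_table = {char: b for b, char in enumerate(decode_table)}
    return decode_table, encode_table


def encode(text, encode_table):
    """Encode text; raises KeyError for characters outside the page."""
    return bytes(encode_table[char] for char in text)


def decode(data, decode_table):
    """Decode bytes back to text using the derived page."""
    return "".join(decode_table[b] for b in data)


# Hypothetical custom page: CP 437 with the euro sign at 0x9B.
decode_table, encode_table = derive_code_page("cp437", {0x9B: "\u20ac"})

print(encode("5 \u20ac", encode_table))   # euro sign lands on byte 0x9B
print(decode(b"caf\x82", decode_table))   # unchanged positions keep CP 437
print("\u00a2" in encode_table)           # the replaced cent sign is gone
```

Because each override displaces whatever the base page held at that position, a derived page trades one character for another; the total repertoire of an 8-bit page stays at 256 positions.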
In some cases, the Code Page setting of the Engine must be specified together with the Output Text Format for the final output document. With other output formats (e.g., MS Word), specifying the Code Page is superfluous because their output converters ignore the setting.