ImageGear .NET
MOR Multi-Lingual Omnifont Recognition Module

Module name: MOR

Module identifier: OMNIFONT_MOR

Filling methods supported: OMNIFONT, DRAFTDOT24, OCRA, OCRB

Filters supported: all filter elements

Trade-off supported: FAST, BALANCED, ACCURATE

Knowledge base files: RECOGN.BCT and RECOGN24.BCT

The PLUS2W and PLUS3W recognition modules also require the presence of this module.
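
The properties listed above can be captured in a small lookup record, which is handy when validating settings before a recognition run. The following is a minimal sketch only: `ModuleProfile`, `MOR` and `supports` are hypothetical names invented for illustration and are not part of the ImageGear API; only the string values come from this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModuleProfile:
    """Illustrative record of a recognition module's properties (not an ImageGear type)."""
    name: str
    identifier: str
    filling_methods: frozenset
    trade_offs: frozenset
    knowledge_base_files: tuple

    def supports(self, filling_method: str, trade_off: str) -> bool:
        """Return True if both the filling method and the trade-off are listed for this module."""
        return filling_method in self.filling_methods and trade_off in self.trade_offs


# Values copied from the property list on this page.
MOR = ModuleProfile(
    name="MOR",
    identifier="OMNIFONT_MOR",
    filling_methods=frozenset({"OMNIFONT", "DRAFTDOT24", "OCRA", "OCRB"}),
    trade_offs=frozenset({"FAST", "BALANCED", "ACCURATE"}),
    knowledge_base_files=("RECOGN.BCT", "RECOGN24.BCT"),
)
```

For example, `MOR.supports("OCRA", "FAST")` is true, while any filling method not in the list above is rejected.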

Application Areas

This module recognizes machine-printed text, i.e., text from printed publications, laser or ink-jet printers, and electric typewriters. Output from mechanical typewriters in good condition may also be acceptable. The module can also be used for letter-quality or near-letter-quality (LQ, NLQ) output from dot-matrix printers. For draft-quality 24-pin dot-matrix documents, use the DRAFTDOT24 filling method. NLQ or LQ output can usually be recognized better without DRAFTDOT24.
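
The guidance on dot-matrix output can be expressed as a simple selection rule. This is a sketch under stated assumptions: the source labels (`"dot_matrix_24pin_draft"` and so on) and the function name are hypothetical, and falling back to OMNIFONT for all other machine-printed sources is an assumption drawn from the paragraph above rather than a documented default.

```python
# Hypothetical labels for the kinds of source documents mentioned above.
DRAFT_24PIN = "dot_matrix_24pin_draft"
KNOWN_SOURCES = {
    "printed_publication",
    "laser_or_inkjet",
    "typewriter",
    "dot_matrix_nlq_or_lq",
    DRAFT_24PIN,
}


def choose_filling_method(source: str) -> str:
    """Pick a filling method name for a given (hypothetical) source label."""
    if source not in KNOWN_SOURCES:
        raise ValueError(f"unknown source type: {source!r}")
    # Only draft-quality 24-pin dot-matrix documents call for DRAFTDOT24;
    # NLQ/LQ dot-matrix output is usually recognized better without it.
    if source == DRAFT_24PIN:
        return "DRAFTDOT24"
    return "OMNIFONT"  # assumed choice for other machine-printed text
```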

This module can handle a maximum of 500 zones defined on an image.
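
An application that builds zones programmatically may want to check this limit before starting recognition. A minimal sketch, assuming zones are held in an ordinary Python sequence; `MAX_ZONES` and `check_zone_count` are hypothetical names, not ImageGear identifiers:

```python
MAX_ZONES = 500  # limit stated in the text above


def check_zone_count(zones) -> int:
    """Return the number of zones, or raise ValueError if it exceeds the module's limit."""
    count = len(zones)
    if count > MAX_ZONES:
        raise ValueError(
            f"{count} zones defined; this module handles at most {MAX_ZONES}"
        )
    return count
```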

Range of Characters

This module can recognize about 500 characters, termed the Engine’s Total Character Set. It includes the letters of the Latin, Greek and Cyrillic alphabets, with enough accented letters to recognize the 119 languages supported by the Engine.

The set is classified as follows:

Category                              Non-accented   Accented
Latin alphabet upper case letters     26             89
Latin alphabet lower case letters     26             91
Digits                                10
Punctuation                           29
Miscellaneous (math symbols, etc.)    55
Cyrillic upper case letters           33             14
Cyrillic lower case letters           33             14
Greek upper case letters              24             9
Greek lower case letters              25             11
OCR (OCR-A) characters                3

The characters are listed in category and alphanumeric order, together with their Code Page values, in Characters and Code Pages. These are the character categories used by the filter elements.
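
As a quick check on the "about 500" figure, the per-category counts from the classification table can be totalled. The sketch below simply adds the numbers given on this page. Storing the single-count categories (digits, punctuation, miscellaneous, OCR-A) in the non-accented position is an assumption about the table layout, and it does not affect the total.

```python
# (non-accented, accented) counts per category, copied from the table above.
# Categories with a single count are stored as (count, 0) by assumption.
CHARACTER_COUNTS = {
    "Latin upper case": (26, 89),
    "Latin lower case": (26, 91),
    "Digits": (10, 0),
    "Punctuation": (29, 0),
    "Miscellaneous": (55, 0),
    "Cyrillic upper case": (33, 14),
    "Cyrillic lower case": (33, 14),
    "Greek upper case": (24, 9),
    "Greek lower case": (25, 11),
    "OCR (OCR-A)": (3, 0),
}


def total_characters(counts=CHARACTER_COUNTS) -> int:
    """Sum every count across all categories."""
    return sum(plain + accented for plain, accented in counts.values())


print(total_characters())  # prints 492
```

The listed categories add up to 492, consistent with the "about 500" characters quoted for the total character set.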

Character Attributes

The omnifont recognition module can detect and transmit character attributes: bold, italic or underlined text (or any combination of them). It can also detect and transmit character size, and can classify font types into three broad categories: serif, sans serif and monospaced.
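
One way to picture the attribute information described above is as a small per-character record. This is an illustrative data structure only: `FontCategory`, `CharAttributes` and their fields are hypothetical names, not the types ImageGear returns, and the unit of `size` is left unspecified because this page does not state it.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class FontCategory(Enum):
    """The three broad font categories named in the text."""
    SERIF = "serif"
    SANS_SERIF = "sans serif"
    MONOSPACED = "monospaced"


@dataclass
class CharAttributes:
    """Hypothetical container for the attributes a recognized character may carry."""
    bold: bool = False
    italic: bool = False
    underlined: bool = False
    size: Optional[float] = None          # unit not specified by the source
    font: Optional[FontCategory] = None

    def styles(self) -> List[str]:
        """List the active style flags; any combination of the three is possible."""
        flags = [("bold", self.bold), ("italic", self.italic), ("underlined", self.underlined)]
        return [name for name, active in flags if active]
```

For instance, `CharAttributes(bold=True, italic=True, font=FontCategory.SERIF).styles()` yields `["bold", "italic"]`.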

Speed/Accuracy Choices

The multi-lingual omnifont recognition module relies primarily on contour analysis, and can supplement it with a form of pattern matching that does not require large pre-stored shape libraries.

This module interprets all three page-level recognition trade-off settings: ACCURATE, BALANCED and FAST.

The module is tightly integrated with the checking module, giving a total of five speed/accuracy choices.


©2016. Accusoft Corporation. All Rights Reserved.
