The ImageGear.Recognition Namespace API enables you to build OCR applications for the Windows .NET development environment. An application using the Recognition API can do the following:
- Accept as input any image file format supported by ImageGear. This image data can be binary, gray, or color.
- Recognize a document image and export text in any format defined by the recognition engine. These formats include 8-bit text, HTML, XML, and word processor formats such as Microsoft Word or WordPerfect.
- Select the language or languages of documents to be recognized. The list includes English, Eastern and Western European languages, Asian languages, Cyrillic-based languages (such as Russian), the Baltic languages, Turkish, and Greek. Multilingual documents can be recognized accurately because the API lets the application specify the set of languages to use for recognition.
- Enable end users to verify text during the recognition process.
- Increase recognition accuracy with built-in and user-defined dictionaries.
- Output confidence values for post-recognition processing.
- Automatically segment the page to correctly recognize text on pages with complex or irregular layouts, including tables, reverse video, and line art.
- Allow the user to delineate and manage zones of a document page and then specify treatment for those zones. This includes the ability to correct the OCR engine's automatic segmentation between the segmentation phase and the recognition phase.
- Process both text and graphics. The recognition software's ability to distinguish graphics from text can provide the basis of a compound document processing system.
- Automatically detect fax, dot matrix, and other degraded documents and compensate accordingly.
- Use a scalable voting architecture that provides developers with two pre-made voting interfaces (OmniPage and PLUS) and direct access to three leading OCR engines (MOR, MTX, and FireWorX).
- Recognize handwritten text using a numbers-only module or an alphanumeric module.
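
As a rough illustration of what post-recognition processing with confidence values can look like, the sketch below splits recognized words into those accepted automatically and those routed to an end user for verification. It is written in Python for neutrality and does not use the ImageGear.Recognition API. The `RecognizedWord` type, the `split_by_confidence` helper, the 0.0 to 1.0 confidence scale, and the 0.80 threshold are all assumptions made for this example, and the sample values are invented.

```python
from dataclasses import dataclass


@dataclass
class RecognizedWord:
    """Hypothetical container for one recognized word (not an ImageGear type)."""
    text: str
    confidence: float  # assumed scale: 0.0 (no confidence) to 1.0 (certain)


def split_by_confidence(words, threshold=0.80):
    """Partition words into (accepted, needs_review) around a threshold."""
    accepted, needs_review = [], []
    for word in words:
        if word.confidence >= threshold:
            accepted.append(word)
        else:
            needs_review.append(word)
    return accepted, needs_review


# Illustrative values only, not real engine output.
sample = [
    RecognizedWord("Invoice", 0.98),
    RecognizedWord("t0tal", 0.41),
    RecognizedWord("due", 0.93),
]

accepted, needs_review = split_by_confidence(sample)
print([w.text for w in needs_review])
```

In a real application the threshold would likely be tuned per document type, and the low-confidence words might also be checked against a dictionary before being shown to a reviewer.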