OCR Overview
OCR (Optical Character Recognition) is the process of converting machine-printed information into editable text. The SmartZone OCR component used by FormAssist does this by analyzing zones, which are the field areas selected on the form template. Each zone has a pre-defined character set that the engine selects from when it reads the filled-in form field.
FormAssist window with OCR tab open and an OCR field highlighted
The image above shows an example OCR field (Your Name) highlighted in both the Tree View and the Image View.
- The Tree View displays the field name and icon.
- The Image View outlines the field area currently selected.
- The Properties View displays all of the modifiable OCR Field Properties for the field type.
Because the OCR engine analyzes the image for specific text, it is recommended that you remove (or 'dropout') the form and apply any image enhancements to improve OCR processing performance.
See the Image Enhancement topic in this section for more information on how to improve recognition processing performance. To create OCR fields, see the OCR Fields topic under the Define and Create Fields section.
Properties View
The tabs of the Properties View are:
Tab | Description |
General | This tab contains the field area coordinates, which can be modified either by dragging the outlined field in the Image View or by editing the values on this tab. |
Dropout | Dropout is important because it helps the OCR engine accurately distinguish the machine-printed text from the original form. Dropout is recommended whenever possible. See the Dropout Properties topic for more information. |
ScanFix Xpress | The ScanFix Xpress settings are recommended as they improve the image, allowing the OCR engine to recognize the text with greater accuracy. See the ScanFix Xpress Properties topic for more details. |
OCR | The OCR property settings allow you to select the language, character set, confidence values, blob size, minimum text line height, rejection character, spaces, multiple text lines, split merged characters and split overlapping characters. Adjusting these settings for the OCR engine can improve performance and increase result accuracy. See the OCR Property Details below for more information. |
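To make the interplay of character set, confidence values, and rejection character more concrete, here is a minimal Python sketch. It is not the SmartZone OCR API: the names CharCandidate and apply_rejection, the 0 to 100 confidence scale, and the "~" rejection character are all assumptions made for illustration only.

```python
import string
from dataclasses import dataclass


@dataclass
class CharCandidate:
    """One recognized character with a hypothetical 0-100 confidence score."""
    char: str
    confidence: int


def apply_rejection(candidates, allowed_chars, min_confidence, rejection_char="~"):
    """Keep a character only if it is in the allowed character set and meets
    the minimum confidence; otherwise substitute the rejection character."""
    result = []
    for candidate in candidates:
        in_set = candidate.char in allowed_chars
        confident = candidate.confidence >= min_confidence
        result.append(candidate.char if in_set and confident else rejection_char)
    return "".join(result)


# Illustrative data: a name field restricted to letters, where the engine
# misread "o" as the digit "0" with low confidence.
name_candidates = [
    CharCandidate("J", 95),
    CharCandidate("0", 40),
    CharCandidate("h", 90),
    CharCandidate("n", 88),
]
name_text = apply_rejection(name_candidates, set(string.ascii_letters), 60)
print(name_text)
```

In this sketch, a rejected position stays visible in the result so that a reviewer can find and correct it, instead of the engine silently accepting a doubtful character.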