In a typical form processing system, your goal is to take images of various forms, recognize them, extract data from them and save that data in a database.
FormAssist uses the following Accusoft components in this scenario:
- FormDirector to define the forms and fields and create form sets.
- FormFix to identify the images, drop out the form and perform OMR.
- ImagXpress for advanced image editing and clean up.
- NotateXpress for image annotation capabilities.
- ScanFix Xpress to perform some image clean-up at certain stages of the operation.
- SmartZone ICR to read hand-printed data.
- SmartZone OCR to read machine-printed data.
- Barcode Xpress to recognize barcodes.
- PDF Xpress to use PDFs as template images and filled form images.
Identification
Form identification is an important part of form processing: identifying each form correctly gives you the best accuracy and speed in text recognition.
In some cases, a different identification method may give you better results. For example, if you control the layout of your forms, you may be able to get faster and more reliable identification by placing a different barcode on each type of form. Even in this case, you will probably still want to use the Form Set Identify Properties settings, since they also precisely register your images. Simply use the provided mechanism to limit identification to the single, correct form template. For forms that have registration marks and a single template to compare against, consider using the FormFix RegistrationProcessor, which can boost recognition speed and reliability.
See the Form Set Identify Properties topic for detailed information on setting form set Identify properties.
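The barcode approach described above can be sketched as a simple lookup that narrows identification to one template. This is only an illustration in Python: the barcode values, the template names, and the select_template function are hypothetical and are not part of FormAssist or FormFix. Reading the barcode itself is assumed to happen elsewhere.

```python
from typing import Optional

# Hypothetical mapping from the barcode value printed on each form type
# to the name of the matching form template in the form set.
BARCODE_TO_TEMPLATE = {
    "FORM-A": "claim_form",
    "FORM-B": "enrollment_form",
}


def select_template(barcode_value: Optional[str]) -> Optional[str]:
    """Return the template name for a barcode value.

    Returns None when no barcode was read or the value is unknown, so the
    caller can fall back to identifying against the full form set.
    """
    if barcode_value is None:
        return None
    # Normalize whitespace and case before the lookup.
    return BARCODE_TO_TEMPLATE.get(barcode_value.strip().upper())
```

When the lookup returns a template name, identification can be limited to that one template so that the image is still registered against it. When it returns None, identification would run against the full set of candidate forms.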
Dropout
Dropout is important because it improves recognition and identification by limiting the markings that remain on the forms. There are alternatives to the FormAssist dropout process that work better in some cases, as listed below.
Many forms-processing applications achieve very good results using a dropout bulb on each scanner. You must match the color of the bulb to the color of the form, and you must be careful not to allow your users to enter filled data using the dropout color. Although the dropout performance is excellent, identification and registration can be quite difficult unless you can use a barcode or other technique for identification and some form of registration marks for registration. The FormFix engine used by FormAssist has functionality to help you register your images using registration marks.
See the Dropout Properties topic for detailed information on Dropout settings and adjustments.
Use ScanFix Xpress in place of Dropout
ScanFix Xpress has line and comb removal technology that you can use in place of dropout in some cases. Unlike with color dropout bulbs on a scanner, you can choose between the ScanFix Xpress properties and the FormFix dropout properties on a field-by-field basis to get the best performance in every case.
The most common uses of the ScanFix Xpress properties are to deskew images before identification and to remove noise on field clips before extracting data from them.
See the Form Set ScanFix Xpress Properties topic for more details on these properties and how to adjust them.
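The field-by-field choice described above can be pictured as a default cleanup method plus per-field overrides. This is a minimal sketch in Python. The method names, the field names, and the cleanup_method_for function are hypothetical. They do not correspond to actual ScanFix Xpress or FormFix property names.

```python
# Hypothetical labels for the two cleanup approaches discussed above.
DEFAULT_METHOD = "formfix_dropout"
VALID_METHODS = {"formfix_dropout", "scanfix_line_removal"}


def cleanup_method_for(field_name: str, overrides: dict) -> str:
    """Pick the cleanup method for one field.

    Fields listed in `overrides` use the method given there; every other
    field uses DEFAULT_METHOD. Unknown method names are rejected early.
    """
    method = overrides.get(field_name, DEFAULT_METHOD)
    if method not in VALID_METHODS:
        raise ValueError(f"Unknown cleanup method for {field_name!r}: {method!r}")
    return method


# Example: only the comb-style field switches to line and comb removal.
overrides = {"ssn_comb": "scanfix_line_removal"}
```

Keeping a single default and overriding only the fields that need it makes it easy to compare results for one field at a time.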
Single-Form Processing
A common simplification of general forms processing is a similar system that accepts only a single form. This is common in very low-volume applications where a user scans and interactively processes a batch of pages that all use the same form. It is also common in a few high-volume applications, when a company sends out a single form to be filled out and returned for processing.
Although this scenario differs from the one above because form identification is not necessary, you can treat it the same as general-purpose form processing with an alternate form identification method, such as a barcode.
Image Sorting
In an image sorting application, your goal is to take images of various forms, sort them by type, then archive the result.
Typically, little or no data is captured in this type of system.
FormAssist uses these components in this scenario:
- FormDirector to define the set of candidate forms.
- FormFix to identify the images.
- ScanFix Xpress to perform some image clean-up.
If you want to align your archived images to match your template images, you should use the form set Identify properties used by the FormFix engine to align them before processing them.
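The sorting step in this scenario can be sketched as grouping images by their identified form type and planning an archive location for each. This is an illustration only, written in Python. The plan_archive function, the folder layout, and the "unidentified" folder are assumptions, and the identification results are assumed to come from a separate identification step.

```python
from collections import defaultdict
from pathlib import PurePosixPath

# Hypothetical folder name for images that matched no candidate form.
UNIDENTIFIED = "unidentified"


def plan_archive(results, archive_root="archive"):
    """Group images by identified form type.

    `results` is an iterable of (image_name, form_name) pairs, where
    form_name is None when identification found no match. Returns a dict
    mapping each folder name to the list of planned archive paths.
    """
    plan = defaultdict(list)
    for image_name, form_name in results:
        folder = form_name or UNIDENTIFIED
        target = PurePosixPath(archive_root) / folder / image_name
        plan[folder].append(str(target))
    return dict(plan)
```

The function only plans the destinations. Copying or moving the files is left to the caller, which keeps the sketch free of side effects.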
High Volume Forms Processing
The FormAssist application is a demonstration of how to set up, test, and perform forms processing using a GUI.
A sample program provided with FormSuite is the Forms Processing Server, which demonstrates an alternative, high-volume method for processing forms. The Forms Processing Server is a console application designed for unattended batch processing of forms. See the help on the Forms API for more details on the Forms Processing Server.
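As a rough picture of what unattended batch processing involves, the sketch below runs a processing step over a batch and records failures instead of stopping. It is not the Forms Processing Server's code and does not use the Forms API. The run_batch function and the process_one callback are hypothetical names chosen for illustration.

```python
def run_batch(images, process_one):
    """Process every image in the batch without operator intervention.

    `process_one` is a caller-supplied function that handles one image
    and raises an exception on failure. A failure is recorded and the
    batch continues with the next image.
    """
    succeeded, failed = [], []
    for image in images:
        try:
            process_one(image)
            succeeded.append(image)
        except Exception as exc:
            # Keep the reason so the failed image can be reviewed later.
            failed.append((image, str(exc)))
    return succeeded, failed
```

Collecting the failures rather than aborting is what lets a batch run to completion when nobody is watching it.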