Drop-out is a process whereby the pre-printed content (template data) on an image is removed, leaving only the data that was added to a form (filled data). When filled and template data overlap, the filled data will be reconstructed as accurately as possible.
FormFix can generate a clip of a field in a filled form and, optionally, drop out the template from the image, leaving only the filled data in the resulting clip image.
The FormFix API also lets you control whether the drop-out process attempts to reconstruct areas of the filled image where filled pixels overlap template pixels. FormFix drop-out requires that the image's identification is known and that precise registration information is available; FormFix's identification processor provides both.
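The following toy sketch illustrates why drop-out depends on precise registration and why reconstruction matters where filled and template pixels overlap. It is not FormFix's algorithm and does not use the FormFix API; the pixel-set representation and the helper names `shift` and `naive_drop_out` are assumptions made purely for illustration.

```python
# Toy model: an image is a set of (row, col) coordinates of black pixels.
# Illustration only; this is not how FormFix implements drop-out.

def shift(pixels, d_row, d_col):
    """Return the pixel set translated by (d_row, d_col)."""
    return {(r + d_row, c + d_col) for r, c in pixels}

def naive_drop_out(filled, template):
    """Remove every black pixel that is also black in the template."""
    return filled - template

# Template: a horizontal form line on row 2, columns 0..9.
template = {(2, c) for c in range(10)}
# Filled data: a vertical pen stroke in column 4 that crosses the line.
stroke = {(r, 4) for r in range(5)}

# Case 1: perfectly registered scan.
filled = template | stroke
clean = naive_drop_out(filled, template)
# The form line is gone, but so is the stroke pixel at (2, 4), where the
# stroke overlapped the line. A reconstruction step would try to fill it.
print(sorted(clean))

# Case 2: the same scan, misregistered by one row.
misregistered = shift(filled, 1, 0)
residue = naive_drop_out(misregistered, template)
# Count how many pixels of the shifted form line survived drop-out.
surviving_line = residue & shift(template, 1, 0)
print(len(surviving_line))
```

In the registered case the form line disappears and the stroke keeps all but its overlapping pixel. In the misregistered case the entire shifted line survives, which is the kind of residue that poor registration leaves behind.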
There are alternatives to FormFix's drop-out algorithm, and they perform better in some cases.
Many forms processing applications achieve very good results using a drop-out bulb on each scanner. You must match the color of the bulb to the color of the form, and you must make sure that your users do not enter filled data in the drop-out color. Although the drop-out performance using this method is excellent, identification and registration can be quite difficult unless you can use a barcode or other technique for identification and some form of registration marks for registration. FormFix has functionality to help you register your images using registration marks.
Note: ScanFix has line removal, comb removal, and virtual bulb technology that you can use in place of FormFix's drop-out in some cases. Unlike using color drop-out bulbs on a scanner, you can select between ScanFix and FormFix on a field-by-field basis to get the best performance in every case.
The most common uses of ScanFix are to deskew images before identification and to remove noise on field clips before extracting data from them.
Because drop-out is a key part of forms processing, follow the guidelines described below.
Improving the Quality of Drop-Out
Drop-out problems are usually caused by poor registration, but some may also be due to distortion of the form.
If drop-out leaves pieces of the form across the entire page, the problem is probably poor registration. Please refer to the sections on identifying images for more information. If drop-out leaves pieces of the form in only one or a few areas, the problem is probably distortion of the images. In rare cases, a large difference between the thickness of lines in the filled image and the template image also contributes to dropout issues across the entire page.
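The rule of thumb above can be sketched as a simple check on where leftover template pixels fall after drop-out. This is a hypothetical diagnostic, not a FormFix feature: the function names, the grid size, and the 0.5 threshold are all assumptions, and the check cannot tell poor registration apart from the rarer line-thickness mismatch, since both leave residue across the whole page.

```python
def residue_spread(residue, height, width, grid=4):
    """Fraction of grid cells that contain at least one leftover pixel.

    residue is a set of (row, col) pixels with 0 <= row < height and
    0 <= col < width; the page is divided into grid x grid cells.
    """
    cells = set()
    for r, c in residue:
        cells.add((r * grid // height, c * grid // width))
    return len(cells) / (grid * grid)

def likely_cause(residue, height, width, grid=4, threshold=0.5):
    """Widespread residue suggests poor registration; localized residue
    suggests distortion. The threshold is an arbitrary assumption."""
    if not residue:
        return "clean"
    spread = residue_spread(residue, height, width, grid)
    return "registration" if spread >= threshold else "distortion"

# Leftover line fragments scattered over a 100 x 100 page.
widespread = {(r, c) for r in range(5, 100, 10) for c in range(5, 100, 10)}
# Leftover fragments confined to one corner of the page.
localized = {(r, c) for r in range(80, 90) for c in range(80, 90)}

print(likely_cause(widespread, 100, 100))
print(likely_cause(localized, 100, 100))
```

With these inputs the scattered residue is classified as a probable registration problem and the corner-only residue as probable distortion.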
When resolving distortion problems, first consider the possibility that two separate revisions of the same form are being processed. Sometimes this happens because the creator of the form made a change without recording it or without changing the form's revision number. It can also be caused by differences in printers: the same form printed by different printers can contain font changes that cause drop-out to fail. If this is the case, see the section on handling similar forms during identification for additional suggestions.
Possible Methods to Improve Drop-Out Performance
- Turn off PerformReconstruction if you do not need it.
- Try reducing the value of AllowableMisRegistration. Setting the value too small, however, may cause some forms to drop out poorly.
- With overlapping fields, consider using one larger field instead of multiple overlapping ones. The number of pixels contained in each field is the dominant factor in performance and each pixel that is part of two separate fields takes twice as long to extract.
- In the case of several very small fields that are near each other, assess the use of one large field that covers all the desired pixels. There is a small amount of fixed time used for each field and for very small fields that time can be higher than the time spent processing each pixel.
- If the field is large, but not all the pixels within it are used, study the possibility of using several smaller fields. This will eliminate time spent processing pixels that you will not use.
- Attempt to improve consistency between the thickness/brightness of lines in the filled and template images as early as possible. In other words, be consistent in the printing and scanning process.
Once the images are scanned, a template cannot be adjusted to match inconsistent thickness/brightness in the filled images. In most situations, adjusting the thickness/brightness of a filled image after scanning can also adversely affect the filled content that is used for later data acquisition.
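The field-layout suggestions above trade a fixed per-field cost against a per-pixel cost. The sketch below models that trade-off so the three cases can be compared. The `Field` type, the helper functions, and both cost constants are hypothetical and are not taken from FormFix; real timings should be measured on your own forms.

```python
from typing import NamedTuple

class Field(NamedTuple):
    left: int
    top: int
    width: int
    height: int

    @property
    def area(self) -> int:
        return self.width * self.height

# Hypothetical cost constants, chosen only to make the trade-off visible.
PER_FIELD_OVERHEAD = 500.0  # fixed cost per field, arbitrary units
PER_PIXEL_COST = 1.0        # cost per pixel processed, arbitrary units

def estimated_cost(fields):
    """Fixed overhead per field plus a cost for every pixel in every field.
    A pixel covered by two fields is counted twice."""
    return sum(PER_FIELD_OVERHEAD + PER_PIXEL_COST * f.area for f in fields)

def bounding_field(fields):
    """Smallest single field that covers all the given fields."""
    left = min(f.left for f in fields)
    top = min(f.top for f in fields)
    right = max(f.left + f.width for f in fields)
    bottom = max(f.top + f.height for f in fields)
    return Field(left, top, right - left, bottom - top)

cases = {
    # Two heavily overlapping fields: one larger field is cheaper.
    "overlapping": [Field(0, 0, 100, 40), Field(20, 0, 100, 40)],
    # Two tiny neighboring fields: the fixed overhead dominates.
    "small neighbors": [Field(0, 0, 10, 10), Field(12, 0, 10, 10)],
    # Two small fields far apart: merging wastes time on unused pixels.
    "far apart": [Field(0, 0, 20, 20), Field(500, 0, 20, 20)],
}

for name, fields in cases.items():
    separate = estimated_cost(fields)
    merged = estimated_cost([bounding_field(fields)])
    choice = "merge" if merged < separate else "keep separate"
    print(f"{name}: separate={separate}, merged={merged}, {choice}")
```

Under these assumed constants, merging wins for the overlapping and small neighboring fields, while the widely separated fields are cheaper left as they are.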
Alternatives
Form drop-out is very similar to the line removal, comb removal, and virtual bulb operations within ScanFix. In some cases, you will get better results by using those technologies.
Figure: Comb Removal in ScanFix, before and after images.