ImageGear for C and C++ on Windows v19.1 - Updated
Search

The search operation is conducted by an ImageGear approximate regex object (HIG_REC_APPROX_REGEX) against a page (HIGEAR or HIG_REC_IMAGE), a document (HMIGEAR or HIG_REC_DOCUMENT), or a Unicode string (LPAT_WCHAR) to produce an array of matches (AT_REC_MATCH_RESULT).

To search a page, document, or string:

Create an Approximate Regex Object

Create an approximate regex object with the function IG_REC_approx_regex_create. When the object is no longer needed, call IG_REC_approx_regex_delete to release its resources.

ImageGear HIG_REC_APPROX_REGEX instances are not thread-safe. Callers are responsible for synchronizing access to instances shared across multiple threads before invoking operations that could modify or delete that instance.

Configure the Search

After creating an approximate regex object, configure it to perform an exact or approximate search and to broadcast notifications as the search is conducted. Configurable settings include:

Search for Matches

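After configuration is complete, search any of the supported inputs to recover an array of matches (AT_REC_MATCH_RESULT): a page (HIGEAR or HIG_REC_IMAGE), a document (HMIGEAR or HIG_REC_DOCUMENT), or a Unicode string (LPAT_WCHAR).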

Zero Matches?

A search that returns zero matches does not necessarily mean the text is absent. Image resolution affects the accuracy of the ImageGear Recognition engine's OCR, and consequently the accuracy of auto-redact.

Consider an attempt to redact the text “football” from the 96 DPI page depicted below:

A search for the pattern “football” fails to recover any matches because the OCR text recovered from the 96 DPI page, which serves as the search domain, is not accurate:

eApefscp :sainmscins ON Jalial u :seinffisqns oivki suogwetrio :amffiscins auo
necnooi. :seielep ON Jouids :selelep omi illy° :alelep au°
sseuiseeivks :spesu! ON wndpuei]eizi :spesu! oftni
19(1981701e :pasu! auo

For this particular example, using the ImageGear function IG_image_resolution_set to change the page’s reported resolution from 96 DPI to 128 DPI is sufficient to produce the expected OCR text:

One insert: alph4abet Two inserts: refe[rendpum No inserts: sweetness
One delete: crittr Two deletes: spiner No deletes: football
One substitute: chambions Two substitutes: n telfer No substitutes: objective

Repeating the search for the pattern “football” locates a single match that is subsequently redacted, as depicted below:

The page OCR Performance Issues offers some additional suggestions that may improve the accuracy of ImageGear’s recognition engine for some images.
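
The sample page in this topic labels its words by the number of inserts, deletes, and substitutes they contain, which are the edits an approximate search tolerates. The sketch below shows one common way to count such edits, the Levenshtein distance, in plain C. It is an illustration only: this topic does not state which algorithm HIG_REC_APPROX_REGEX uses, edit_distance is a hypothetical helper that is not part of the ImageGear API, and the intended spellings mentioned afterward ("alphabet", "referendum", "critter", "champions") are assumptions inferred from the sample text.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Smallest of three values. */
static size_t min3(size_t a, size_t b, size_t c)
{
    size_t m = a < b ? a : b;
    return m < c ? m : c;
}

/*
 * Hypothetical helper (not an ImageGear function).
 * Returns the Levenshtein distance: the minimum number of single-character
 * inserts, deletes, and substitutes needed to turn `pattern` into `text`.
 * Returns (size_t)-1 if the working buffer cannot be allocated.
 */
size_t edit_distance(const char *pattern, const char *text)
{
    assert(pattern != NULL && text != NULL);

    size_t n = strlen(pattern);
    size_t m = strlen(text);

    /* row[j] holds the distance between the pattern prefix processed so far
       and the first j characters of text. */
    size_t *row = malloc((m + 1) * sizeof *row);
    if (row == NULL)
        return (size_t)-1;

    /* An empty pattern needs j inserts to become the first j characters. */
    for (size_t j = 0; j <= m; ++j)
        row[j] = j;

    for (size_t i = 1; i <= n; ++i) {
        size_t diag = row[0];  /* previous row, column j - 1 */
        row[0] = i;            /* i deletes turn the prefix into empty text */
        for (size_t j = 1; j <= m; ++j) {
            size_t above = row[j];  /* previous row, column j */
            size_t cost = (pattern[i - 1] == text[j - 1]) ? 0 : 1;
            row[j] = min3(above + 1,       /* delete pattern[i - 1] */
                          row[j - 1] + 1,  /* insert text[j - 1] */
                          diag + cost);    /* match or substitute */
            diag = above;
        }
    }

    size_t result = row[m];
    free(row);
    return result;
}
```

Under those assumptions, edit_distance("alphabet", "alph4abet") is 1 (one insert), edit_distance("referendum", "refe[rendpum") is 2 (two inserts), edit_distance("critter", "crittr") is 1 (one delete), edit_distance("champions", "chambions") is 1 (one substitute), and edit_distance("football", "football") is 0. An exact search corresponds to accepting only a distance of 0.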