ImageGear for C and C++ on Windows v19.10 - Updated
Getting Started with OCR
The Recognition Component has been deprecated and will be removed in a future major release of the product.

This topic provides information about how to get started using ImageGear OCR:

Attach the Recognition Component

The ImageGear Recognition Component must be attached to the Core component and initialized before it is used.

Use the following call to attach the Recognition component:

C and C++
AT_ERRCODE errorCode;
errorCode = IG_comm_comp_attach("REC"); 

If the component is attached successfully, the function returns IGE_SUCCESS.

To check whether the Recognition component is already attached, call IG_comm_comp_check("REC"). A return value of TRUE means that the component is attached.
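
The check-then-attach sequence described above can be sketched as a small helper. This is a minimal sketch, not part of the ImageGear API: the demo_* functions are hypothetical stand-ins so that the code compiles without the SDK, and it assumes that a zero return value from the attach call means success. A real application would call IG_comm_comp_check() and IG_comm_comp_attach() in their place.

```c
#include <string.h>

/* Hypothetical stand-ins so that this sketch compiles without the SDK.
   A real application would include "gear.h" and call
   IG_comm_comp_check() and IG_comm_comp_attach() instead. */
static int demo_rec_attached = 0;

/* Returns nonzero (true) if the named component is attached. */
static int demo_comp_check(const char *name)
{
    return strcmp(name, "REC") == 0 && demo_rec_attached;
}

/* Returns 0 on success (assumed), nonzero on failure. */
static int demo_comp_attach(const char *name)
{
    if (strcmp(name, "REC") != 0)
        return 1;               /* unknown component */
    demo_rec_attached = 1;
    return 0;
}

/* Attaches the component only if it is not attached yet.
   Returns 0 if the component is attached when the function returns. */
static int ensure_component(const char *name)
{
    if (demo_comp_check(name))
        return 0;               /* already attached, nothing to do */
    return demo_comp_attach(name);
}
```

Calling ensure_component("REC") more than once is harmless in this sketch, because the second call sees the component as attached and returns immediately.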

Initialize the Recognition Session

After the Recognition component has been attached, initialize the recognition session with the function IG_REC_initialize(). The function returns 0 if the session is initialized successfully; otherwise, the return value is the number of errors encountered during initialization.

The Recognition component requires additional resource files. See the description of IG_REC_initialize() for information on where these files must be located so that the component can access them.

Close the Recognition Session

The function IG_REC_close() closes the recognition session when it is no longer needed. After the session is closed, no recognition functions other than IG_REC_initialize() may be called. To start a new recognition session, call IG_REC_initialize() again.
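
One way to keep initialization and closing paired is to run all recognition work through a single wrapper. The sketch below is an illustration only: the demo_* functions are hypothetical stand-ins for IG_REC_initialize(), IG_REC_close(), and an arbitrary recognition call, and it assumes that a failed initialization leaves no session to close.

```c
/* Hypothetical stand-ins so that this sketch compiles without the SDK.
   Each returns an error count; 0 means success. */
static int demo_session_open = 0;

static long demo_rec_initialize(void) { demo_session_open = 1; return 0; }
static long demo_rec_close(void)      { demo_session_open = 0; return 0; }

/* Stand-in for any recognition call: fails when no session is open. */
static long demo_rec_call(void)
{
    return demo_session_open ? 0 : 1;
}

/* Runs `work` inside a recognition session and closes the session
   afterwards, whether or not `work` succeeded.
   Returns the error count from initialization or from `work`. */
static long with_rec_session(long (*work)(void))
{
    long errors = demo_rec_initialize();
    if (errors != 0)
        return errors;  /* assumed: nothing to close after a failed init */

    errors = work();
    demo_rec_close();
    return errors;
}
```

With this shape, code outside with_rec_session() never runs while a session is open, so a recognition call after IG_REC_close() becomes harder to write by accident.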

Load and Recognize an Image: Code Example

The code example below demonstrates loading and recognizing the Image.tif image file. The image contains English, machine-printed text. The result is saved in the Image.txt text file.

C and C++
// Windows includes.
#include <windows.h>
// Include for Accusoft ImageGear.
#include "gear.h"
// Include for ImageGear Recognition component.
#include "i_rec.h"

// Load and recognize the Image.tif image file.
// The result will be saved in the Image.txt text file.
VOID OCRImagetoTextFile()
{
    AT_ERRCOUNT errorCount;
    HIGEAR image;
    HIG_REC_IMAGE recognitionImage;
    errorCount = IG_comm_comp_attach("REC");
    errorCount = IG_REC_initialize();
    errorCount = IG_load_file("Image.tif", &image);
    errorCount = IG_REC_image_import(image, &recognitionImage);
    errorCount = IG_image_delete(image);
    errorCount = IG_REC_output_codepage_set("Windows ANSI");
    errorCount = IG_REC_output_text_format_set(IG_REC_DTXT_TXTS);
    errorCount = IG_REC_image_recognize(recognitionImage);
    errorCount = IG_REC_output_direct_text_write(&recognitionImage, 1, "Image.txt");
    errorCount = IG_REC_image_delete(recognitionImage);
    errorCount = IG_REC_close();
}
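
The example above stores each return value in errorCount but never inspects it. The sketch below shows one common C pattern for stopping at the first failing call while still releasing whatever was acquired. It is an illustration only: demo_call(), the STEP_* names, and the handle counter are hypothetical stand-ins that let the code compile without the SDK, and it assumes that a nonzero error count means the call failed.

```c
#include <stdio.h>

/* Steps of the example above, in call order (names are hypothetical). */
enum {
    STEP_ATTACH, STEP_INITIALIZE, STEP_LOAD, STEP_IMPORT,
    STEP_RECOGNIZE, STEP_WRITE, STEP_NONE
};

static int demo_fail_at = STEP_NONE; /* step that the stand-in should fail */
static int demo_live_handles = 0;    /* images and sessions not yet released */

/* Stand-in for an ImageGear call: returns an error count, 0 on success. */
static long demo_call(int step)
{
    return step == demo_fail_at ? 1 : 0;
}

/* Jump to cleanup on the first call that reports errors. */
#define TRY(step)                                            \
    do {                                                     \
        if (demo_call(step) != 0) {                          \
            failed = (step);                                 \
            fprintf(stderr, "OCR step %d failed\n", (step)); \
            goto cleanup;                                    \
        }                                                    \
    } while (0)

/* Returns STEP_NONE on success, otherwise the step that failed. */
static int ocr_image_to_text_file(void)
{
    int failed = STEP_NONE;
    int session_open = 0, have_image = 0, have_rec_image = 0;

    TRY(STEP_ATTACH);
    TRY(STEP_INITIALIZE);
    session_open = 1;   demo_live_handles++;
    TRY(STEP_LOAD);
    have_image = 1;     demo_live_handles++;
    TRY(STEP_IMPORT);
    have_rec_image = 1; demo_live_handles++;
    TRY(STEP_RECOGNIZE);
    TRY(STEP_WRITE);

cleanup:
    /* Release in reverse order of acquisition. A real build would call
       IG_REC_image_delete(), IG_image_delete(), and IG_REC_close() here. */
    if (have_rec_image) demo_live_handles--;
    if (have_image)     demo_live_handles--;
    if (session_open)   demo_live_handles--;
    return failed;
}
```

The flags record which resources exist at the point of failure, so the cleanup section releases only what was actually acquired.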

Tutorial: Create Your First OCR Project

The following tutorial provides step-by-step instructions on how to create a simple application that loads and recognizes an image and saves the recognized data to a file in the specified output format.

Preliminary Steps

Before completing the tutorial steps, complete the preliminary steps outlined in this section:

ImageGear Binary Files to be Accessible from the Application

The following binaries must be located in the working directory of the application or in a directory specified by the system path:

Recognition re-distributable package (see Distributing Recognition Engine Files with Your Application)

igcore19d.dll must be linked to the application using igcore19d.lib or by any other appropriate means. Other DLLs do not need to be linked.

ImageGear Header Files to be Included into the Application Code

The following header files must be included in the code:

C and C++
// Include for ImageGear Core.
#include "gear.h"
// Include for ImageGear Recognition component.
#include "i_rec.h"      

Note that all other ImageGear header files that are referenced from these two must be accessible.

Create the Application

  1. Set the solution name and attach the components. Before working with any ImageGear functionality, you need to set the license solution name and attach all necessary components to the Core. These operations must be performed only once per session.
    C and C++
    AT_ERRCOUNT errorCount;
    AT_ERRCODE errorCode;
    HIGEAR image;
    HIG_REC_IMAGE recognitionImage;
    AT_INT i;
    enumIGRecLangEnable languages[IG_REC_LANG_SIZE];
    HIG_REC_DOCUMENT document;
    // To unlock the toolkit for deployment you must call the
    // IG_lic_solution_name_set(), IG_lic_solution_key_set() and possibly the
    // IG_lic_OEM_license_key_set() functions.
    // See the Licensing section in the ImageGear User Manual for more details.
    // Attach LZW component (if necessary).
    errorCode = IG_comm_comp_attach("LZW");
    // Attach Recognition component.
    errorCode = IG_comm_comp_attach("REC"); 
    
     
  2. Initialize the Recognition component. The Recognition component must be initialized before it is used:
    C and C++
    errorCount = IG_REC_initialize();        
    
     
  3. Load the image file and import its contents into the Recognition image. ImageGear Core operates on HIGEAR image handles, whereas the Recognition component requires its own image representation, HIG_REC_IMAGE. The following code loads the image file and imports the HIGEAR image into an HIG_REC_IMAGE:
    C and C++
    errorCount = IG_load_file("..\\RecognitionC\\Image.tif", &image);
    errorCount = IG_REC_image_import(image, &recognitionImage);
    // HIGEAR can be deleted now.
    IG_image_delete(image); 
    
  4. Set recognition languages. Recognition languages must be set before processing and pre-processing. Note that spelling and correction are enabled by default. Use IG_REC_spelling_is_enabled_set and IG_REC_correction_is_enabled_set with the FALSE parameter to disable them.
    C and C++
    // Disable all languages.
    for(i = 0; i < IG_REC_LANG_SIZE; i ++)
    {
        languages[i] = IG_REC_LANG_DISABLED;
    }
    // Enable English and French.
    languages[IG_REC_LANG_ENG] = IG_REC_LANG_ENABLED;
    languages[IG_REC_LANG_FRE] = IG_REC_LANG_ENABLED;
    
    // Set English and French recognition languages.
    errorCount = IG_REC_languages_set(languages);
    
  5. Preprocess and recognize the image. Preprocessing improves recognition results. It performs operations such as fax correction, deskewing, and rotation. Preprocessing also creates an internal BW image with the secondary resolution conversion and performs inversion (if necessary) and despeckle operations on it. After preprocessing, the image is recognized.
    C and C++
    // Preprocess the image.
    errorCount = IG_REC_image_preprocess(recognitionImage);
    // Recognize the image.
    errorCount = IG_REC_image_recognize(recognitionImage);
    
  6. Set the output format, code page, and output level. These settings must be made before the recognized data is saved to a file.
    C and C++
    // Set Word2000 output format.
    errorCount = IG_REC_output_format_set("Converters.Text.Word2000");
    // Set Windows ANSI code page.
    errorCount = IG_REC_output_codepage_set("Windows ANSI");
    // Set True Page output level.
    errorCount = IG_REC_output_level_set(IG_REC_OL_TRUEPAGE);       
    
  7. Save the document. Formatted output in most formats is performed using the document architecture. If you need only simple text or XML output, you can use the IG_REC_output_direct_text_write function instead; refer to Saving the Recognized Data Directly in the Text Format.
    C and C++
    // Create a document.
    errorCount = IG_REC_document_create(NULL, &document);
    
    // Insert a page into the document.
    // Note that after it has been inserted, the page does not need to be deleted.
    errorCount = IG_REC_document_page_insert(document, recognitionImage, -1);
    
    // Save the document to a file in Word2000 format (set above).
    errorCount = IG_REC_document_write(document, "Tutorial.DOC");
    
    // Close document.
    IG_REC_document_close(document);        
    
  8. Close the Recognition component.
    C and C++
    IG_REC_close();
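
Step 4 above disables every language and then enables the required ones by index. That pattern can be sketched as a reusable helper. This is a minimal sketch under stated assumptions: the DEMO_* names are hypothetical stand-ins for the enumIGRecLangEnable values and the IG_REC_LANG_* indices defined in "i_rec.h", and the table size of three is chosen only for illustration.

```c
#include <stddef.h>

/* Hypothetical stand-ins so that this sketch compiles without the SDK.
   The real enable values and language indices come from "i_rec.h". */
typedef enum { DEMO_LANG_DISABLED, DEMO_LANG_ENABLED } demo_lang_enable;
enum { DEMO_LANG_ENG, DEMO_LANG_GER, DEMO_LANG_FRE, DEMO_LANG_SIZE };

/* Disables every language, then enables the requested ones.
   Out-of-range indices are skipped.
   Returns the number of distinct languages enabled. */
static size_t select_languages(demo_lang_enable table[DEMO_LANG_SIZE],
                               const int *wanted, size_t wanted_count)
{
    size_t i;
    size_t enabled = 0;

    for (i = 0; i < (size_t)DEMO_LANG_SIZE; i++)
        table[i] = DEMO_LANG_DISABLED;

    for (i = 0; i < wanted_count; i++) {
        int lang = wanted[i];
        if (lang < 0 || lang >= DEMO_LANG_SIZE)
            continue;                       /* ignore an invalid index */
        if (table[lang] == DEMO_LANG_DISABLED) {
            table[lang] = DEMO_LANG_ENABLED;
            enabled++;                      /* count each language once */
        }
    }
    return enabled;
}
```

In a real application, the filled table would then be passed to IG_REC_languages_set(), as in step 4.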
    

See Also

Recognition Component API Reference