ImageGear .NET v25.2 - Updated
    Work with OCR Settings

    In addition to recognizing images with the default language settings, ImageGear lets you specify the language or languages to be recognized on a page. If the language of the page is known before recognition, defining it makes recognition more precise: the appropriate character sets are used during recognition, and dictionaries specific to that language are applied to the recognized character constructions.

    Use the LanguageEnabled property of ImGearOCRSettings to define a specific language or languages.

    The supported languages are divided into several language groups. Languages from one group may be incompatible with languages from other groups, and recognition may return an error when languages from different groups are enabled together. To avoid enabling incompatible languages, choose languages from only one of the following groups:

    1. Greek language.
    2. Latin and Cyrillic language group, which unites the CentralEurope, Cyrillic, WesternEurope, Turkish, and Baltic languages. This set of languages includes:
      Afrikaans, Albanian, Andorra, Argentina, Australia, Austria, AzerbaijanCyrillic, AzerbaijanLatin, Baltic, Basque, Belarusian, Belgium, Bosnian,
      Brazil, Bulgarian, Canada, Catalan, CentralAmerica, CentralEurope, Chile, Colombia, Croatian, Cyrillic, Czech, Danish, Dutch, English, Estonian, Faroese,
      Finnish, French, Frisian, German, GreatBritain, Guarani, Hani, Hungarian, Icelandic, Indonesian, Irish, Italian, JapanLatinOnly, KazakhCyrillic, KazakhLatin,
      KirghizCyrillic, Kirundi, Latin, Latvian, Liechtenstein, Lithuanian, Luxembourgish, Macedonian, Malay, Mexico, Netherlands, NewZealand, Norwegian,
      Polish, Portuguese, Quechua, RhaetoRomanic, Romanian, Russian, Rwanda, Scandinavia, SerbianCyrillic, Shona, Slovak, Slovenian, Somali, Sorbian,
      SouthAfrica, SouthAmerica, Spanish, Swahili, Swedish, Switzerland, TajikCyrillic, Turkish, TurkmenCyrillic, TurkmenLatin, Ukrainian, USA, UzbekCyrillic,
      UzbekLatin, Venezuela, WesternEurope, Wolof, Xhosa, Zulu.
    3. ChineseSimplified and ChineseTraditional languages.
    4. ChineseHongKong language.
    5. Japanese language.
    6. Korean language.
    7. Thai language.
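
    The grouping rule above can be checked before any settings are changed. The following is a minimal sketch and not part of the ImageGear API: the class name LanguageGroupCheck, the method AreCompatible, and the use of plain language names as strings are assumptions made for illustration, and the table holds only a few of the languages listed above. Mapping these names to the corresponding ImGearOCRLanguage members is left out because the enumeration member names are not given in this topic.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not an ImageGear type.
public static class LanguageGroupCheck
{
    // Partial, illustrative table: language name -> group number from the list above.
    private static readonly Dictionary<string, int> GroupOf = new Dictionary<string, int>
    {
        { "Greek", 1 },
        { "English", 2 }, { "French", 2 }, { "German", 2 }, { "Russian", 2 },
        { "ChineseSimplified", 3 }, { "ChineseTraditional", 3 },
        { "ChineseHongKong", 4 },
        { "Japanese", 5 },
        { "Korean", 6 },
        { "Thai", 7 },
    };

    // Returns true when all given languages belong to the same group.
    // An empty selection also returns true; a name missing from the
    // table throws KeyNotFoundException.
    public static bool AreCompatible(IEnumerable<string> languages)
    {
        return languages.Select(name => GroupOf[name]).Distinct().Count() <= 1;
    }
}
```

    With this table, AreCompatible(new[] { "French", "German" }) returns true because both names are in group 2, while AreCompatible(new[] { "French", "Japanese" }) returns false because the names come from groups 2 and 5.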

     

    You require an ImageGear license that includes support for Asian languages to enable the following:

    • ChineseSimplified
    • ChineseTraditional
    • ChineseHongKong
    • Japanese
    • Korean
    • Thai

    The following example illustrates how to recognize a page containing only French text.

    C#
    using System;
    using ImageGear.Core;
    using ImageGear.OCR;

    // Wrapper class added so the method compiles; the class name is illustrative.
    public static class OcrSettingsExample
    {
        public static string RecognizeFrenchText(ImGearRasterPage rasterPage)
        {
            string resultString = null;

            // Create the OCR engine with default settings.
            using (ImGearOCR igOcr = ImGearOCR.Create())
            {
                // Turn off all languages.
                foreach (ImGearOCRLanguage language in Enum.GetValues(typeof(ImGearOCRLanguage)))
                {
                    igOcr.Settings.LanguageEnabled[language] = false;
                }

                // Turn on only the French language.
                igOcr.Settings.LanguageEnabled[ImGearOCRLanguage.FRE] = true;

                // Import the ImageGear page into the recognition repository.
                using (ImGearOCRPage igOcrPage = igOcr.ImportPage(rasterPage))
                {
                    // Recognize the page and read back the recognized text.
                    igOcrPage.Recognize();
                    resultString = igOcrPage.Text;
                }
            }

            return resultString;
        }
    }
    
    VB.NET
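    Imports System
    Imports ImageGear.Core
    Imports ImageGear.OCR

    ' Wrapper class added so the Shared function compiles; the class name is illustrative.
    Public Class OcrSettingsExample
        Public Shared Function RecognizeFrenchText(ByVal rasterPage As ImGearRasterPage) As String
            Dim resultString As String = Nothing

            ' Create the OCR engine with default settings.
            Using igOcr As ImGearOCR = ImGearOCR.Create()

                ' Turn off all languages.
                For Each language As ImGearOCRLanguage In [Enum].GetValues(GetType(ImGearOCRLanguage))
                    igOcr.Settings.LanguageEnabled(language) = False
                Next

                ' Turn on only the French language.
                igOcr.Settings.LanguageEnabled(ImGearOCRLanguage.FRE) = True

                ' Import the ImageGear page into the recognition repository.
                Using igOcrPage As ImGearOCRPage = igOcr.ImportPage(rasterPage)
                    ' Recognize the page and read back the recognized text.
                    igOcrPage.Recognize()
                    resultString = igOcrPage.Text
                End Using
            End Using

            Return resultString
        End Function
    End Class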