ImageGear .NET v24.14 - Updated
Export to Text Formats

The ImageGear Recognition API can save recognized data in a number of simple text and XML formats. Use the ImGearRecOutputManager.WriteDirectText methods to write the recognized data of an ImGearRecPage object, or of an array of ImGearRecPage objects, to a file or a stream as text. Use the ImGearRecOutputManager.DirectTextFormat property to get or set the output format. The available formats are the values of the ImGearRecDirectTextFormat enumeration.

Writing Results to a File

The following example loads an image file, recognizes it, and outputs it to a file as formatted text.

C#
// igRecognition is assumed to be an ImGearRecognition instance initialized earlier.
using (FileStream content = new FileStream("test1.tif", FileMode.Open))
{
    // Load the first page and import it into the recognition engine.
    ImGearPage igPage = ImGearFileFormats.LoadPage(content, 0);
    ImGearRecPage recPage = igRecognition.ImportPage((ImGearRasterPage)igPage);
    recPage.Image.Preprocess();
    recPage.Recognize();
    // Select the code page and the output format.
    igRecognition.OutputManager.CodePage = "Windows ANSI";
    igRecognition.OutputManager.DirectTextFormat = ImGearRecDirectTextFormat.FormattedText;
    // Remove any previous output file before writing.
    if (File.Exists("singlePage.TXT"))
    {
        File.Delete("singlePage.TXT");
    }
    igRecognition.OutputManager.WriteDirectText(recPage, "singlePage.TXT");
    recPage.Dispose();
}
VB.NET
' igRecognition is assumed to be an ImGearRecognition instance initialized earlier.
Using content As New FileStream("test1.tif", FileMode.Open)
    ' Load the first page and import it into the recognition engine.
    Dim igPage As ImGearPage = ImGearFileFormats.LoadPage(content, 0)
    Dim recPage As ImGearRecPage = igRecognition.ImportPage(DirectCast(igPage, ImGearRasterPage))
    recPage.Image.Preprocess()
    recPage.Recognize()
    ' Select the code page and the output format.
    igRecognition.OutputManager.CodePage = "Windows ANSI"
    igRecognition.OutputManager.DirectTextFormat = ImGearRecDirectTextFormat.FormattedText
    ' Remove any previous output file before writing.
    If File.Exists("singlePage.TXT") Then
        File.Delete("singlePage.TXT")
    End If
    igRecognition.OutputManager.WriteDirectText(recPage, "singlePage.TXT")
    recPage.Dispose()
End Using

Writing Results to a Stream

The following example loads an image file, recognizes it, outputs it to a stream as formatted text, and then reads the stream back into a string.

C#
string resultText = "";
// Open the source image; the recognized text goes to an in-memory stream.
using (FileStream content = new FileStream("test1.tif", FileMode.Open))
using (MemoryStream stream = new MemoryStream())
{
    ImGearPage igPage = ImGearFileFormats.LoadPage(content, 0);
    ImGearRecPage recPage = igRecognition.ImportPage((ImGearRasterPage)igPage);
    recPage.Image.Preprocess();
    recPage.Recognize();
    igRecognition.OutputManager.CodePage = "Windows ANSI";
    igRecognition.OutputManager.DirectTextFormat = ImGearRecDirectTextFormat.FormattedText;
    igRecognition.OutputManager.WriteDirectText(recPage, stream);
    using (StreamReader reader = new StreamReader(stream))
    {
        // Rewind the stream before reading back what was written.
        stream.Seek(0, SeekOrigin.Begin);
        resultText = reader.ReadToEnd();
    }
    recPage.Dispose();
}
VB.NET
Dim resultText As String = ""
' Open the source image; the recognized text goes to an in-memory stream.
Using content As New FileStream("test1.tif", FileMode.Open), stream As New MemoryStream()
    Dim igPage As ImGearPage = ImGearFileFormats.LoadPage(content, 0)
    Dim recPage As ImGearRecPage = igRecognition.ImportPage(DirectCast(igPage, ImGearRasterPage))
    recPage.Image.Preprocess()
    recPage.Recognize()
    igRecognition.OutputManager.CodePage = "Windows ANSI"
    igRecognition.OutputManager.DirectTextFormat = ImGearRecDirectTextFormat.FormattedText
    igRecognition.OutputManager.WriteDirectText(recPage, stream)
    Using reader As New StreamReader(stream)
        ' Rewind the stream before reading back what was written.
        stream.Seek(0, SeekOrigin.Begin)
        resultText = reader.ReadToEnd()
    End Using
    recPage.Dispose()
End Using
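
Writing Multiple Pages to a File

The introduction to this topic notes that WriteDirectText also accepts an array of ImGearRecPage objects. The following is a minimal sketch of that case, not a definitive implementation. It assumes that igRecognition is an initialized ImGearRecognition instance, that the input file contains at least pageCount pages, and that the array overload takes the same file-name argument as the single-page call. The class and method names (MultiPageTextExport, WritePagesAsText) and their parameters are hypothetical and exist only for illustration.

```csharp
using System.IO;
using ImageGear.Core;
using ImageGear.Formats;
using ImageGear.Recognition;

static class MultiPageTextExport
{
    // Hypothetical helper: recognizes the first pageCount pages of imagePath
    // and writes them to textPath as one formatted-text file.
    // Assumption: igRecognition is already initialized by the caller.
    public static void WritePagesAsText(
        ImGearRecognition igRecognition, string imagePath, string textPath, int pageCount)
    {
        ImGearRecPage[] recPages = new ImGearRecPage[pageCount];
        try
        {
            using (FileStream content = new FileStream(imagePath, FileMode.Open))
            {
                for (int i = 0; i < pageCount; i++)
                {
                    // Same per-page steps as the single-page examples.
                    ImGearPage igPage = ImGearFileFormats.LoadPage(content, i);
                    recPages[i] = igRecognition.ImportPage((ImGearRasterPage)igPage);
                    recPages[i].Image.Preprocess();
                    recPages[i].Recognize();
                }
            }

            igRecognition.OutputManager.CodePage = "Windows ANSI";
            igRecognition.OutputManager.DirectTextFormat = ImGearRecDirectTextFormat.FormattedText;

            // Remove any previous output file, as the single-page example does.
            if (File.Exists(textPath))
            {
                File.Delete(textPath);
            }

            // Assumed array overload: writes all recognized pages in one call.
            igRecognition.OutputManager.WriteDirectText(recPages, textPath);
        }
        finally
        {
            // Dispose every page that was imported, even if a later step failed.
            foreach (ImGearRecPage recPage in recPages)
            {
                if (recPage != null)
                {
                    recPage.Dispose();
                }
            }
        }
    }
}
```

A call such as MultiPageTextExport.WritePagesAsText(igRecognition, "test2.tif", "multiPage.TXT", 2) would then recognize the first two pages of a hypothetical two-page TIFF. Disposing the pages in a finally block is a design choice of this sketch, so that pages imported before a failure are still released.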

A more advanced approach to outputting recognized data is to use Formatted Output. However, this requires ImGearRecLicenseFeature.FormattedOutput to be enabled. For more details, see Export to a Formatted Document.