The ImageGear Recognition API can save recognized data in several simple text and XML formats. Use the ImGearRecOutputManager.WriteDirectText methods to write the recognized data of an ImGearRecPage object, or an array of ImGearRecPage objects, to a file or a stream as text. Use the ImGearRecOutputManager.DirectTextFormat property to get or set the output format. The following formats are available:
- Simple Text.
- Comma Separated Text, which can be used to represent tables.
- Formatted Text, which delivers plain text but uses tab characters to try to preserve the layout (columns and boxes) detected in the original image.
- Simple XML with letter coordinates, which is typically used for further processing of recognized data. You can parse the output XML file (for example, with MSXML) or transform it (with XSLT). The XML output follows the same schema as the Layout Retention XML output. See the Nuance XML schema, ssdoc-schema3.xsd, in the Bin directory of the ImageGear .NET installation.
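As a first step in processing the XML output, it can help to see which elements a file actually contains before writing schema-specific code. The following is a minimal sketch that uses only the standard System.Xml.Linq classes. The class name XmlOutputSummary and the method CountElements are hypothetical helpers, not part of ImageGear, and the sketch deliberately assumes nothing about the element names defined in ssdoc-schema3.xsd.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper, not part of the ImageGear API.
public static class XmlOutputSummary
{
    // Returns how many times each element name occurs in the given XML text.
    // Elements are grouped by local name, so namespace prefixes are ignored.
    public static Dictionary<string, int> CountElements(string xml)
    {
        XDocument doc = XDocument.Parse(xml);
        return doc.Descendants()
                  .GroupBy(e => e.Name.LocalName)
                  .ToDictionary(g => g.Key, g => g.Count());
    }
}
```

To inspect a file written with the Simple XML format, pass its contents to the helper, for example XmlOutputSummary.CountElements(File.ReadAllText(path)), where path is the name you gave to WriteDirectText.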
Writing Results to a File
The following example loads an image file, recognizes it, and outputs it to a file as formatted text.
C#
// Requires System.IO. igRecognition is assumed to be an already initialized recognition engine object.
using (FileStream content = new FileStream("test1.tif", FileMode.Open))
{
    // Load the first page and import it into the recognition engine.
    ImGearPage igPage = ImGearFileFormats.LoadPage(content, 0);
    ImGearRecPage recPage = igRecognition.ImportPage((ImGearRasterPage)igPage);
    recPage.Image.Preprocess();
    recPage.Recognize();

    // Select the output code page and the direct text format.
    igRecognition.OutputManager.CodePage = "Windows ANSI";
    igRecognition.OutputManager.DirectTextFormat = ImGearRecDirectTextFormat.FormattedText;

    // Remove any previous output file before writing.
    if (File.Exists("singlePage.TXT"))
    {
        File.Delete("singlePage.TXT");
    }
    igRecognition.OutputManager.WriteDirectText(recPage, "singlePage.TXT");
    recPage.Dispose();
}
VB.NET
' Requires System.IO. igRecognition is assumed to be an already initialized recognition engine object.
Using content As New FileStream("test1.tif", FileMode.Open)
    ' Load the first page and import it into the recognition engine.
    Dim igPage As ImGearPage = ImGearFileFormats.LoadPage(content, 0)
    Dim recPage As ImGearRecPage = igRecognition.ImportPage(DirectCast(igPage, ImGearRasterPage))
    recPage.Image.Preprocess()
    recPage.Recognize()

    ' Select the output code page and the direct text format.
    igRecognition.OutputManager.CodePage = "Windows ANSI"
    igRecognition.OutputManager.DirectTextFormat = ImGearRecDirectTextFormat.FormattedText

    ' Remove any previous output file before writing.
    If File.Exists("singlePage.TXT") Then
        File.Delete("singlePage.TXT")
    End If
    igRecognition.OutputManager.WriteDirectText(recPage, "singlePage.TXT")
    recPage.Dispose()
End Using
Writing Results to a Stream
The following example loads an image file, recognizes it, writes the result to a memory stream as formatted text, and then reads that text back into a string.
C#
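// Requires System.IO. igRecognition is assumed to be an already initialized recognition engine object.
string resultText = "";
using (FileStream content = new FileStream("test1.tif", FileMode.Open))
using (MemoryStream stream = new MemoryStream())
{
    // Load the first page and import it into the recognition engine.
    ImGearPage igPage = ImGearFileFormats.LoadPage(content, 0);
    ImGearRecPage recPage = igRecognition.ImportPage((ImGearRasterPage)igPage);
    recPage.Image.Preprocess();
    recPage.Recognize();

    // Select the output code page and format, then write to the stream.
    igRecognition.OutputManager.CodePage = "Windows ANSI";
    igRecognition.OutputManager.DirectTextFormat = ImGearRecDirectTextFormat.FormattedText;
    igRecognition.OutputManager.WriteDirectText(recPage, stream);

    // Rewind the stream, then read the recognized text back.
    stream.Seek(0, SeekOrigin.Begin);
    using (StreamReader reader = new StreamReader(stream))
    {
        resultText = reader.ReadToEnd();
    }
    recPage.Dispose();
}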
VB.NET
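' Requires System.IO. igRecognition is assumed to be an already initialized recognition engine object.
Dim resultText As String = ""
Using content As New FileStream("test1.tif", FileMode.Open)
    Using stream As New MemoryStream()
        ' Load the first page and import it into the recognition engine.
        Dim igPage As ImGearPage = ImGearFileFormats.LoadPage(content, 0)
        Dim recPage As ImGearRecPage = igRecognition.ImportPage(DirectCast(igPage, ImGearRasterPage))
        recPage.Image.Preprocess()
        recPage.Recognize()

        ' Select the output code page and format, then write to the stream.
        igRecognition.OutputManager.CodePage = "Windows ANSI"
        igRecognition.OutputManager.DirectTextFormat = ImGearRecDirectTextFormat.FormattedText
        igRecognition.OutputManager.WriteDirectText(recPage, stream)

        ' Rewind the stream, then read the recognized text back.
        stream.Seek(0, SeekOrigin.Begin)
        Using reader As New StreamReader(stream)
            resultText = reader.ReadToEnd()
        End Using
        recPage.Dispose()
    End Using
End Using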
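When reading the stream back, note that a StreamReader created without an explicit encoding decodes the bytes as UTF-8. This topic does not state which bytes are written for a given CodePage value, so the following is only a sketch under the assumption that the output may use a single-byte encoding. In that case, characters outside ASCII would need a matching Encoding to be read correctly. The class StreamText and the method ReadAll are hypothetical helpers, not part of ImageGear.

```csharp
using System.IO;
using System.Text;

// Hypothetical helper, not part of the ImageGear API.
public static class StreamText
{
    // Rewinds the stream and decodes its entire content with the given encoding.
    // The stream is left open so the caller keeps control of its lifetime.
    public static string ReadAll(MemoryStream stream, Encoding encoding)
    {
        stream.Position = 0;
        using (var reader = new StreamReader(stream, encoding, false, 1024, true))
        {
            return reader.ReadToEnd();
        }
    }
}
```

For example, StreamText.ReadAll(stream, Encoding.GetEncoding("iso-8859-1")) decodes the stream as Latin-1. On .NET Core and later, Windows code pages such as 1252 are available only after the code pages encoding provider from System.Text.Encoding.CodePages has been registered. Check which encoding your output actually uses before choosing one.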
A more advanced approach to outputting recognized data is to use Formatted Output. However, this requires ImGearRecLicenseFeature.FormattedOutput to be enabled. For more details, see Export to a Formatted Document.