Define and Edit Regular Expressions
SmartZone ICR/OCR lets you use regular expressions to describe the expected format of the data in a field. If you know the expected format(s), supplying them makes your recognition results more accurate. (See Regular Expressions for information on syntax.)
Specify the Regular Expression String
Use the SetRegularExpression method (ICR: SetRegularExpression; OCR: SetRegularExpression) to specify the regular expression string.
- Note that the FieldType must be set to RegularExpression for your regular expression to impact the recognition.
try
{
    …
    //Set FieldType to RegularExpression and specify regular expression string
    mySmartZoneICR.Reader.FieldType = Accusoft.SmartZoneICRSdk.FieldType.RegularExpression;
    mySmartZoneICR.Reader.SetRegularExpression("(\\d{4}){4}");
    //Get the regular expression string previously set
    string myRegex = mySmartZoneICR.Reader.GetRegularExpression();
    //Perform Recognition
    TextBlockResult textBlockResult = mySmartZoneICR.Reader.AnalyzeField(imageToRecognize.ToHbitmap(false));
    …
}
catch (Accusoft.SmartZoneICRSdk.InvalidRegularExpressionException ex)
{
    MessageBox.Show(ex.Message);
}
catch (Accusoft.SmartZoneICRSdk.BitDepthException ex)
{
    MessageBox.Show(ex.Message);
}
try
{
    …
    //Set FieldType to RegularExpression and specify regular expression string
    mySmartZoneOCR.Reader.FieldType = Accusoft.SmartZoneOCRSdk.FieldType.RegularExpression;
    mySmartZoneOCR.Reader.SetRegularExpression("(\\d{4}){4}");
    //Get the regular expression string previously set
    string myRegex = mySmartZoneOCR.Reader.GetRegularExpression();
    //Perform Recognition
    TextBlockResult textBlockResult = mySmartZoneOCR.Reader.AnalyzeField(imageToRecognize.ToHbitmap(false));
    …
}
catch (Accusoft.SmartZoneOCRSdk.InvalidRegularExpressionException ex)
{
    MessageBox.Show(ex.Message);
}
catch (Accusoft.SmartZoneOCRSdk.BitDepthException ex)
{
    MessageBox.Show(ex.Message);
}
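The pattern in the samples above is written with C# escape sequences, so "(\\d{4}){4}" reaches the engine as (\d{4}){4}, which describes sixteen digits. The following is a minimal sketch for previewing such a pattern before handing it to SmartZone. It uses .NET's own System.Text.RegularExpressions engine, not the SmartZone engine, so syntax support may differ between the two. PatternPreview and IsFullMatch are hypothetical names introduced only for this illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PatternPreview
{
    // Returns true only when the entire input matches the pattern.
    // Uses the .NET regex engine as a stand-in (an assumption, not the SmartZone engine).
    public static bool IsFullMatch(string pattern, string input)
    {
        // Wrap the pattern in a non-capturing group and anchor both ends.
        return Regex.IsMatch(input, @"\A(?:" + pattern + @")\z");
    }

    public static void Main()
    {
        string escaped = "(\\d{4}){4}";   // regular string literal, backslash doubled
        string verbatim = @"(\d{4}){4}";  // verbatim literal, same characters

        Console.WriteLine(escaped == verbatim);                        // True
        Console.WriteLine(IsFullMatch(verbatim, "1234567812345678"));  // True
        Console.WriteLine(IsFullMatch(verbatim, "1234-5678"));         // False
    }
}
```

A verbatim string literal (prefixed with @) avoids the doubled backslashes and can make longer patterns easier to read.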
To use a regular expression to supplement the recognition of a specific field type, call the SetFieldRegularExpression method (ICR: SetFieldRegularExpression; OCR: SetFieldRegularExpression).
Supported field types for regular expression are Currency, CurrencyPlus, Date, Email, SocialSecurityNumber, Time, UnitedStatesPhoneNumber, URL, and RegularExpression.
As an example, suppose part numbers consist of two letters followed by four digits, such as AB1234. You can use [[:alpha:]]{2}\d{4} to describe that format.
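The part-number format above can be checked against sample strings before it is used for recognition. This is a minimal sketch using the .NET regex engine rather than SmartZone's. The .NET engine has no POSIX-style alphabetic class, so the sketch substitutes [A-Za-z] as an assumed equivalent for ASCII letters. PartNumberCheck and IsPartNumber are hypothetical names.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PartNumberCheck
{
    // Assumed .NET equivalent of the part-number format:
    // exactly two ASCII letters followed by exactly four digits.
    private static readonly Regex PartNumber = new Regex(@"\A[A-Za-z]{2}[0-9]{4}\z");

    public static bool IsPartNumber(string text)
    {
        return PartNumber.IsMatch(text);
    }

    public static void Main()
    {
        // Print each candidate with the result of the check.
        foreach (string candidate in new[] { "AB1234", "A12345", "AB12345" })
        {
            Console.WriteLine(candidate + " -> " + IsPartNumber(candidate));
        }
    }
}
```

Only AB1234 passes here; the other two candidates have the wrong number of letters or digits.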
Get the Regular Expression of the Field Type
Use the GetFieldRegularExpression method (ICR: GetFieldRegularExpression; OCR: GetFieldRegularExpression) to get the regular expression of the field type.
…
mySmartZoneICR.Reader.SetFieldRegularExpression(Accusoft.SmartZoneICRSdk.FieldType.Date, "\\d{2}\\/\\d{2}\\/\\d{4}");
string dateRegularExpression = mySmartZoneICR.Reader.GetFieldRegularExpression(Accusoft.SmartZoneICRSdk.FieldType.Date);
…
…
mySmartZoneOCR.Reader.SetFieldRegularExpression(Accusoft.SmartZoneOCRSdk.FieldType.Date, "\\d{2}\\/\\d{2}\\/\\d{4}");
string dateRegularExpression = mySmartZoneOCR.Reader.GetFieldRegularExpression(Accusoft.SmartZoneOCRSdk.FieldType.Date);
…
Setting a regular expression does not by itself cause it to be used during recognition. The expression takes effect only when FieldType is set to the corresponding field type.
…
//Set field regular expression for Currency and Date
mySmartZoneICR.Reader.SetFieldRegularExpression(Accusoft.SmartZoneICRSdk.FieldType.Date, "\\d{2}\\/\\d{2}\\/\\d{4}");
mySmartZoneICR.Reader.SetFieldRegularExpression(Accusoft.SmartZoneICRSdk.FieldType.Currency, "\\$\\d{2}");
//Set FieldType to Date, so only the regular expression of the Date field type will be used in this recognition.
mySmartZoneICR.Reader.FieldType = Accusoft.SmartZoneICRSdk.FieldType.Date;
TextBlockResult textBlockResult = mySmartZoneICR.Reader.AnalyzeField(imageToRecognize.ToHbitmap(false));
//Set FieldType to Currency, and the regular expression of the Currency field type will be used in the following recognition process.
mySmartZoneICR.Reader.FieldType = Accusoft.SmartZoneICRSdk.FieldType.Currency;
TextBlockResult currencyTextBlockResult = mySmartZoneICR.Reader.AnalyzeField(imageToRecognize.ToHbitmap(false));
…
//Set field regular expression for Currency and Date
mySmartZoneOCR.Reader.SetFieldRegularExpression(Accusoft.SmartZoneOCRSdk.FieldType.Date, "\\d{2}\\/\\d{2}\\/\\d{4}");
mySmartZoneOCR.Reader.SetFieldRegularExpression(Accusoft.SmartZoneOCRSdk.FieldType.Currency, "\\$\\d{2}");
//Set FieldType to Date, so only the regular expression of the Date field type will be used in this recognition.
mySmartZoneOCR.Reader.FieldType = Accusoft.SmartZoneOCRSdk.FieldType.Date;
TextBlockResult textBlockResult = mySmartZoneOCR.Reader.AnalyzeField(imageToRecognize.ToHbitmap(false));
//Set FieldType to Currency, and the regular expression of the Currency field type will be used in the following recognition process.
mySmartZoneOCR.Reader.FieldType = Accusoft.SmartZoneOCRSdk.FieldType.Currency;
TextBlockResult currencyTextBlockResult = mySmartZoneOCR.Reader.AnalyzeField(imageToRecognize.ToHbitmap(false));
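The relationship between stored expressions and the active FieldType can be pictured as a lookup table in which only the entry for the active field type is consulted. The sketch below is a conceptual model only, not the SDK's implementation. FieldKind, FieldRegexTable, and all of their members are hypothetical names that do not exist in SmartZone.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the SDK's FieldType enumeration.
public enum FieldKind { Date, Currency }

// Hypothetical model: one stored expression per field type.
public class FieldRegexTable
{
    private readonly Dictionary<FieldKind, string> expressions =
        new Dictionary<FieldKind, string>();

    // The field type whose expression is currently in effect.
    public FieldKind ActiveField { get; set; }

    // Stores (or overwrites) the expression for one field type.
    public void Set(FieldKind field, string pattern)
    {
        expressions[field] = pattern;
    }

    // Returns the expression stored for the active field type,
    // or an empty string when none has been stored.
    public string ActiveExpression()
    {
        string pattern;
        return expressions.TryGetValue(ActiveField, out pattern) ? pattern : string.Empty;
    }
}

public static class FieldRegexDemo
{
    public static void Main()
    {
        var table = new FieldRegexTable();
        table.Set(FieldKind.Date, @"\d{2}\/\d{2}\/\d{4}");
        table.Set(FieldKind.Currency, @"\$\d{2}");

        // Both expressions are stored, but only the active one is returned.
        table.ActiveField = FieldKind.Date;
        Console.WriteLine(table.ActiveExpression());

        table.ActiveField = FieldKind.Currency;
        Console.WriteLine(table.ActiveExpression());
    }
}
```

In this model, storing the Currency expression has no effect on the lookup until ActiveField is switched to Currency, which mirrors the behavior described above for FieldType.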
Setting the field regular expression for the RegularExpression field type is equivalent to setting the regular expression directly with SetRegularExpression.
// The 2 lines below have the same effect
…
mySmartZoneICR.Reader.SetRegularExpression("[A-Z]{3}");
mySmartZoneICR.Reader.SetFieldRegularExpression(Accusoft.SmartZoneICRSdk.FieldType.RegularExpression, "[A-Z]{3}");
…
// The 2 lines below have the same effect
…
mySmartZoneOCR.Reader.SetRegularExpression("[A-Z]{3}");
mySmartZoneOCR.Reader.SetFieldRegularExpression(Accusoft.SmartZoneOCRSdk.FieldType.RegularExpression, "[A-Z]{3}");
…
Clear the Regular Expression of the Field Type
Use the ClearFieldRegularExpression method (ICR: ClearFieldRegularExpression; OCR: ClearFieldRegularExpression) to clear the regular expression of the field type.
//Clear field regular expression for the Time field type
……
mySmartZoneICR.Reader.ClearFieldRegularExpression(Accusoft.SmartZoneICRSdk.FieldType.Time);
……
//Clear field regular expression for the Time field type
……
mySmartZoneOCR.Reader.ClearFieldRegularExpression(Accusoft.SmartZoneOCRSdk.FieldType.Time);
……
Set Case Insensitivity
Case insensitivity is controlled by the RegularExpressionCaseInsensitivity property (ICR: RegularExpressionCaseInsensitivity; OCR: RegularExpressionCaseInsensitivity). The setting applies to all regular expressions, not to an individual field type.
……
// Set all regular expressions to be case insensitive
mySmartZoneICR.Reader.RegularExpressionCaseInsensitivity = true;
……
……
// Set all regular expressions to be case insensitive
mySmartZoneOCR.Reader.RegularExpressionCaseInsensitivity = true;
……
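To illustrate what case insensitivity means for a pattern such as [A-Z]{3}, the sketch below compares matching with and without a case-insensitive option. It uses the .NET regex engine and its RegexOptions.IgnoreCase flag as a stand-in, on the assumption that the SmartZone property has a comparable effect. CaseDemo and Matches are hypothetical names.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CaseDemo
{
    // Full-string match using the .NET engine, optionally ignoring case.
    public static bool Matches(string pattern, string input, bool ignoreCase)
    {
        RegexOptions options = ignoreCase ? RegexOptions.IgnoreCase : RegexOptions.None;
        return Regex.IsMatch(input, @"\A(?:" + pattern + @")\z", options);
    }

    public static void Main()
    {
        Console.WriteLine(Matches("[A-Z]{3}", "ABC", false)); // True
        Console.WriteLine(Matches("[A-Z]{3}", "abc", false)); // False
        Console.WriteLine(Matches("[A-Z]{3}", "abc", true));  // True
    }
}
```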