ImageGear for C and C++ on Windows v19.10 - Updated
Concatenation and Alternation

To describe a target field that is more than one character long, you combine several atoms, each optionally followed by a multiplier. This operation is called concatenation, and its operator is implicit in regular expressions, much like the implicit multiplication operator in algebra when you write something like 2xy. For example, in the serial number regular expression shown in the Using Regular Expressions in the User Dictionary section, there is an implicit concatenation operator between 'S' and '/' and between ' *' and '[A-Z]'.
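
As a rough illustration of implicit concatenation, the sketch below uses the C++ standard library's std::regex (ECMAScript grammar) as a stand-in; the OCR engine's own regular expression dialect may differ in details. The helper field_matches and the pattern kConcatPattern are hypothetical names made up for this example. The pattern only strings together the pieces named above and is not the serial number expression from the other section.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not an ImageGear API): returns true only when the
// whole field matches the pattern, using std::regex as a stand-in matcher.
bool field_matches(const std::string& field, const std::string& pattern) {
    return std::regex_match(field, std::regex(pattern));
}

// Illustrative pattern built purely by concatenation (no explicit operator):
//   'S'      literal atom
//   '/'      literal atom
//   ' *'     space atom with the '*' multiplier (zero or more spaces)
//   '[A-Z]'  character-class atom (exactly one uppercase letter)
const std::string kConcatPattern = "S/ *[A-Z]";
```

Under these assumptions, field_matches("S/ B", kConcatPattern) succeeds, while "S/" alone fails because the final '[A-Z]' atom has nothing left to match.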

The vertical bar ('|') operator in regular expressions is analogous to the addition operator in algebra, as in 2xy+3z. Concatenating atoms, with or without multipliers, produces an alternative (also called a branch). You can then join such alternatives with the vertical bar to build a regular expression that matches more complex targets. For example, 'DOC|RTF' matches target fields containing either DOC or RTF. Sometimes you need parentheses, just as in algebra, to express more complicated cases. To match filenames with an extension of either DOC or RTF, write '.*\.(DOC|RTF)' (note the backslash before the second dot, which specifies a literal dot character rather than 'any character'). This expression is analogous to the algebraic expression 2x(y+z). Without the parentheses, '.*\.DOC|RTF' consists of the two alternatives '.*\.DOC' and 'RTF', so it would match MY.DOC or RTF alone, but it would not match something like MY.RTF.
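
The effect of the parentheses can be checked with a small sketch. It again assumes C++ std::regex (ECMAScript grammar) as a stand-in for the OCR engine's matcher and requires the whole field to match. The names matches_whole_field, kEither, kGrouped, and kUngrouped are hypothetical and exist only for this example.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not an ImageGear API): whole-field match via std::regex.
bool matches_whole_field(const std::string& field, const std::string& pattern) {
    return std::regex_match(field, std::regex(pattern));
}

// Two alternatives joined by '|': the field is DOC or RTF.
const std::string kEither = R"re(DOC|RTF)re";

// Parentheses limit the alternation to the extension, like 2x(y+z).
const std::string kGrouped = R"re(.*\.(DOC|RTF))re";

// Without parentheses the alternatives are '.*\.DOC' and 'RTF', like 2xy+z.
const std::string kUngrouped = R"re(.*\.DOC|RTF)re";
```

With these patterns, matches_whole_field("MY.RTF", kGrouped) succeeds but matches_whole_field("MY.RTF", kUngrouped) does not, whereas the bare field "RTF" is accepted only by kEither and kUngrouped.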