The smallest building block of a regular expression is the atom, which corresponds to exactly one character in the target field. An atom can be one of the following:
- Any alphanumeric character stands for itself. In the serial number example in Using Regular Expressions in the User Dictionary, the 'S' is such an atom.
- Any non-alphanumeric character that is not defined as a meta-character stands for itself. In the serial number example, the '/' is such an atom. Meta-characters are characters that have a special meaning within regular expressions, such as the asterisk, brackets, and braces in our example. The complete list of meta-characters is: '\', '|', '(', ')', '[', ']', '{', '}', '+', '*', '?', '^', '$', '.'.
- Any meta-character prefixed with a backslash ('\') stands for itself. For example, to specify a decimal point within a number, you must use '\.' rather than a plain dot, which would be taken as a meta-character (see below). The same holds for the backslash character itself, which is written as '\\'. If you are not sure whether a non-alphanumeric character is a meta-character, you can prefix it with a backslash as well; it then also stands for itself.
- Any character expressed as a backslash followed by its four-digit hexadecimal code stands for itself. For example, '\0041' matches only the capital letter A.
- The dot character '.' stands for any recognized character. It is the wildcard character in regular expressions, similar to the question mark (?) in filename patterns.
- A user-defined character class stands for any character that is a member of that set. The characters of the set are listed within square brackets. In the serial number example, '[0-9]' is such an atom.
- A pre-defined character class or set stands for any character that is a member of that set. Pre-defined sets are denoted by an alphabetical character prefixed with a backslash, as in '\d'. You can see the list of pre-defined sets in the next section.
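
The atom types listed above can be tried out in a short sketch. It uses Python's `re` module as a stand-in, on the assumption that it treats these atoms in the same way as the dialect described here; that is not guaranteed in every detail. One known difference: Python writes the four-digit hexadecimal form as `\u0041` rather than `\0041`. The helper name `matches` is hypothetical and exists only for this illustration.

```python
import re


def matches(pattern: str, text: str) -> bool:
    """Return True when the whole text matches the pattern (Python `re` semantics)."""
    return re.fullmatch(pattern, text) is not None


# An alphanumeric character stands for itself.
assert matches("S", "S")

# A non-alphanumeric character that is not a meta-character stands for itself.
assert matches("/", "/")

# An escaped meta-character stands for itself: '\.' matches only a literal dot.
assert matches(r"3\.14", "3.14")
assert not matches(r"3\.14", "3x14")

# An unescaped dot is the wildcard and matches any single character.
assert matches("3.14", "3x14")

# A user-defined character class matches any one member of the set.
assert matches("[0-9]", "7")
assert not matches("[0-9]", "a")

# A pre-defined character class such as '\d' also matches one member of its set.
assert matches(r"\d", "7")

# Hexadecimal character code (Python spelling, an assumption for this sketch).
assert matches(r"\u0041", "A")
```

Each pattern in the sketch consumes exactly one character per atom, which is why `"3.14"` needs a four-character target to match.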
Although a portion of a regular expression enclosed in parentheses '( )' can correspond to more than one character in the target field, it is treated syntactically as an atom, so every operation that can be applied to an atom can also be applied to a parenthesized regular expression. For an example, see the Multipliers topic.
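
As a rough illustration of a parenthesized expression behaving as a single atom, the sketch below again uses Python's `re` module as a stand-in. It assumes that the '+' meta-character means "one or more of the preceding atom", as it does in Python; the exact multiplier semantics of this dialect are described in the Multipliers topic. The helper name `matches` is hypothetical.

```python
import re


def matches(pattern: str, text: str) -> bool:
    """Return True when the whole text matches the pattern (Python `re` semantics)."""
    return re.fullmatch(pattern, text) is not None


# Without parentheses, '+' applies only to the preceding atom 'b'.
assert matches("ab+", "abbb")
assert not matches("ab+", "abab")

# With parentheses, '(ab)' is one atom, so '+' repeats the whole pair.
assert matches("(ab)+", "abab")
assert not matches("(ab)+", "abbb")
```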