Excel Function: REGEXTEST (2/3)
Character Quantity
Quantifiers allow you to specify how many times a character can appear. To specify a quantity or a range of quantities, add {} after the concerned character.
For example:
- {2}: the character must appear exactly 2 times
- {3,}: the character must appear 3 times or more
- {1,4}: the character must appear between 1 and 4 times
The regex "Composer{0,}" checks here if the text contains this word with none, one, or multiple "r":
=REGEXTEST(A2,"Composer{0,}")

There are also 3 handy shortcuts for quantifying a character:
- *: the character must appear 0 times or more (equivalent to
{0,}) - +: the character must appear 1 time or more (equivalent to
{1,}) - ?: the character must appear 0 or 1 time (equivalent to
{0,1})
The regex "Composer*" is therefore identical to the previous one:
=REGEXTEST(A2,"Composer*")

Testing a Structure
To test the structure of a text (for example a reference, a URL, a phone number, etc.), you will need to use several elements seen so far in a single expression.
For example, to validate the regex "^XLP-[0-9]{4}-" the text must:
- ^XLP-: start with "XLP-"
- [0-9]{4}: be followed by 4 characters in the range 0 to 9
- -: be followed by a "-"
=REGEXTEST(A2,"^XLP-[0-9]{4}-")

Special Characters
Metacharacters are special characters that have a specific role (you already know ^ $ [ ] | ? * + { }).
To cancel the effect of a metacharacter and treat it as a simple character, precede it with a \ (except when the metacharacter is inside a character class).
Thanks to \, the regex " \?$" can check if the text ends with " ?":
=REGEXTEST(A2," \?$")

\ ^ $ . [ ] | ( ) ? * + { }.\t equals a tab, \r a carriage return, and \n a line break.Character Class Shortcuts
As you now know, character classes define allowed or disallowed characters. There are some handy shortcuts to simplify this:
- \d: a digit (equivalent to
[0-9]) - \D: anything but a digit (equivalent to
[^0-9]) - \w: an alphanumeric character or a "_" (equivalent to
[a-zA-Z0-9_], also including Unicode letters, like accented characters) - \W: anything but an alphanumeric character or a "_" (equivalent to
[^a-zA-Z0-9_], excluding Unicode letters) - \s: a whitespace character, that is, a space, a line break, a carriage return, or a tab
[ \n\r\t] - \S: anything but a whitespace character (equivalent to
[^ \n\r\t]) - \b: a word boundary (start or end of a word), between a
\wand a\W - .: any character except a line break (equivalent to
[^\n])
The regex "^[A-Z]{2}.\w{4}$" checks if the text starts with 2 characters from A-Z, followed by any character (except line break), followed by (and ending with) 4 alphanumeric characters:
=REGEXTEST(A2,"^[A-Z]{2}.\w{4}$")

Case-Insensitive Search
To search for the word "excel" without considering case sensitivity, add (?i) at the beginning of the regex "(?i)excel":
=REGEXTEST(A2,"(?i)excel")

Or, even simpler, you can directly enter the value 1 as the third argument of the REGEXTEST function:
=REGEXTEST(A2,"excel",1)

Email Address Validation
A regex to validate an email address (simplified version) could look like "^[a-z0-9_.-]+@[a-z0-9_.-]{2,}\.[a-z]{2,}$".
To validate this regex, the text must:
- ^[a-z0-9_.-]+: start with 1 or more lowercase letters, digits, "_" or "." or "-"
- @: be followed by "@"
- [a-z0-9_.-]{2,}: be followed by 2 or more lowercase letters, digits, "_" or "." or "-"
- \.: be followed by a dot
- [a-z]{2,}$: be followed (and end) with 2 or more characters from a-z
=REGEXTEST(A2,"^[a-z0-9_.-]+@[a-z0-9_.-]{2,}\.[a-z]{2,}$")

\w is not used here instead of [a-zA-Z0-9_] because \w also includes accented characters, which is not desirable here.