Note that an input that takes a regular expression can be any of:
- a literal string, e.g. "Hello"
- a string that is entirely a regular expression, e.g. "\d{2,3}"
- a mix of the two, e.g. "Hello.+".
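The three kinds of input can be illustrated outside Tinderbox. The sketch below uses Python's `re` module purely as a stand-in: Tinderbox itself uses Apple's engine, and the two engines differ in some details. The sample text and variable names are invented for illustration.

```python
import re

# Invented sample text, not taken from any Tinderbox document.
sample = "Hello world, room 101"

# A literal string: matches exactly those characters.
literal = re.search("Hello", sample)

# A string that is entirely a regular expression: two or three digits.
digits = re.search(r"\d{2,3}", sample)

# A mix of the two: literal "Hello" followed by one or more of any character.
mixed = re.search("Hello.+", sample)

print(literal.group(0))
print(digits.group(0))
print(mixed.group(0))
```

Because a literal string is simply a pattern with no special characters, the same input argument can accept all three forms.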
The operators .contains() and .containsAnyOf() use Apple's regular expression engine.
In action codes (or other operators) that are described as using regular expression patterns for their input arguments (i.e. 'patterns'), the \uNNNN form can be used to specify a character by its 4-digit hexadecimal Unicode code point. This is useful for testing for a particular character, especially one that cannot easily be typed. Thus a non-breaking space, which looks like a normal space when querying text, is encoded as \u00A0, whilst a normal space is \u0020:
$Text = $Text.replace("\u00A0","\u0020");
…replaces every instance of a non-breaking space with a normal space character.
An alternative, if the character is in the ASCII range, is to use the character's two-digit hexadecimal ASCII code in the form \xNN. Thus:
$Text = $Text.replace("XXX","\x22");
…replaces every instance of 'XXX' with a straight double quote character.
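As a rough cross-check of the two escape forms, the following sketch again uses Python's `re` module as a stand-in for Apple's engine, which is an assumption of convenience rather than a description of how Tinderbox works internally. The sample strings and variable names are invented.

```python
import re

# Invented sample: "A", a non-breaking space (U+00A0), then "B".
text = "A\u00a0B"

# The pattern is a raw string, so the regex engine itself interprets \u00A0.
cleaned = re.sub(r"\u00A0", " ", text)

# \xNN takes two hexadecimal digits. Hex 22 (decimal 34) is the
# straight double quote, so this finds both quotes in the sample.
quote_hits = re.findall(r"\x22", 'say "hi"')

print(repr(cleaned))
print(quote_hits)
print(0x22, 0x27)  # decimal values of the two hex codes
```

The last line shows why the hexadecimal value matters: the straight double quote is 22 in hex but 34 in decimal, and the straight single quote is 27 in hex but 39 in decimal.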
Some reference links:
- Syntax for regular expressions using the Apple Regular Expression engine.
- Wikipedia ASCII codes. IMPORTANT: use the values in the 'Hex' column. Thus a straight double quote is 22, a straight single quote is 27.
Historical note re pre-v6 Tinderbox
Originally, regular expressions in Tinderbox used Perl language conventions as further defined in documentation for the Boost regex code library: https://www.boost.org/doc/libs/1_34_1/libs/regex/doc/syntax_perl.html.