Regular expressions in Mathematica
Overview
Mathematica's regular expression flavor
Matching
Replacing
Case-sensitivity
More about regular expressions
Overview
This page is written for readers who are familiar with regular expressions but not with how to use them in Mathematica. Comparisons are made with Perl for those who know that language, though no knowledge of Perl is required.
Mathematica's regular expression flavor
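Mathematica uses essentially the same regular expression flavor as Perl 5. Specifically, Mathematica is compatible with PCRE. Note that each backslash in a regular expression must be doubled, because the pattern is written as a Mathematica string and the backslash is also the string escape character. For example, the \d shortcut for matching any digit must be written as \\d.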
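As a small illustration of the doubled backslash, the sketch below uses StringCases (described later on this page) to pull the runs of digits out of a made-up sample string. The input text is only an assumption for the example.

```mathematica
(* The regex \d+ is written "\\d+" inside a Mathematica string. *)
StringCases["Room 101, floor 3", RegularExpression["\\d+"]]
(* returns {"101", "3"} *)
```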
Matching
The Mathematica function StringFreeQ is analogous to the m// operator in Perl. However, the logic of StringFreeQ is inverted compared to m// because it reports whether a string is "free" of a pattern: it returns True if the string does not contain the pattern and False if it does.
The first argument to StringFreeQ is the string to search. The second argument can be a simple string or a regular expression.
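Because the sense of StringFreeQ is inverted, a test that reads like Perl's m// can be had by negating the result. The sketch below defines a helper for this; the name containsQ is hypothetical and not a built-in function.

```mathematica
(* containsQ is a hypothetical helper, not part of Mathematica. *)
(* It returns True when the string contains the pattern. *)
containsQ[s_String, pattern_] := !StringFreeQ[s, pattern]

containsQ["Hello world", RegularExpression["el+o"]]
(* returns True *)
```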
Examples:
StringFreeQ["Hello world", "ello"]
returns False because the string "Hello world" does contain "ello".
StringFreeQ["Hello world", "el+o"]
returns True because "Hello world" does not contain the literal string "el+o". However,
StringFreeQ["Hello world", RegularExpression["el+o"]]
returns False because "Hello world" does match the regular expression el+o.
If you want to retrieve the text matched rather than simply ask whether there was a match, use StringCases. This function returns a list containing all matches. The list is empty if there were no matches.
Replacing
The Mathematica function StringReplace is analogous to Perl's s/// operator. The first argument is the string to operate on. The second is a regular expression followed by -> and a replacement string. Example:
StringReplace["Hello world", RegularExpression["world"] -> "planet"]
returns "Hello planet". Note that StringReplace does not modify its arguments; it returns a new string. If the replacement string needs to reference captured subexpressions, these can be accessed by $1, $2, etc., just as in Perl.
Example:
StringReplace["Hello world", RegularExpression["(world)"] -> "planet $1"]
returns "Hello planet world".
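Several captured groups can be used in one replacement. The following sketch, using a made-up name as input, swaps two words by referring to $1 and $2.

```mathematica
(* (\w+) is written "(\\w+)" because backslashes are doubled in strings. *)
StringReplace["John Smith", RegularExpression["(\\w+) (\\w+)"] -> "$2, $1"]
(* returns "Smith, John" *)
```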
Note that StringReplace replaces all matches in a string by default, and so it is more precisely analogous to s///g than s///.
To limit the number of replacements, add a third argument specifying the maximum number of replacements. For example, adding 1 as the last argument causes only the first instance to be replaced.
Example:
StringReplace["Hello world", RegularExpression["o"] -> "x", 1]
returns "Hellx world". Without the final argument it would have returned "Hellx wxrld".
Case-sensitivity
The m// and s/// operators in Perl are case-sensitive by default. Perl has two ways of making these operators case-insensitive. One is to append the i modifier after the operator. The other is to add (?i) to the beginning of the regular expression.
Mathematica is also case-sensitive by default, and it also has two ways of changing the case-sensitivity. One is to use the option IgnoreCase -> True. The other is to add (?i) to the regular expression as in Perl.
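Both approaches can be sketched as follows, assuming a sample string whose capitalization differs from the pattern.

```mathematica
(* Option form: IgnoreCase -> True is passed to the string function. *)
StringReplace["Hello World", RegularExpression["world"] -> "planet", IgnoreCase -> True]
(* returns "Hello planet" *)

(* Inline form: (?i) at the start of the regular expression. *)
StringFreeQ["Hello World", RegularExpression["(?i)world"]]
(* returns False *)
```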
More about regular expressions
Notes on using regular expressions in other languages: C++, Python, R, PowerShell
Tips for getting started with regular expressions
Daily tips on regular expressions