Regular Expressions
Regular expressions can be used for searching for patterns
rather than literals. For example, it is possible to
search for variables in SciTE property files,
which look like $(name.subname) with the regular expression:
\$([a-z.]+) (or \$\([a-z.]+\) in posix mode).
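Outside SciTE, the same search can be sketched with Python's re module. This is only an illustration: Python treats ( and ) the way posix mode does, so the posix form of the pattern is used. The function name find_variables and the sample text are made up for this example.

```python
import re

# Posix-style pattern from the text: a literal "$(", then letters or dots, then ")".
VARIABLE = re.compile(r"\$\([a-z.]+\)")

def find_variables(text):
    """Return every $(name.subname) style variable found in text."""
    return VARIABLE.findall(text)

sample = "style.*.32=$(font.base)\nfont.comment=$(font.monospace)"
print(find_variables(sample))
```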
Replacement with regular expressions allows complex
transformations with the use of tagged expressions.
For example, pairs of numbers separated by a ',' could
be reordered by replacing the regular expression:
\([0-9]+\),\([0-9]+\) (or ([0-9]+),([0-9]+)
in posix mode, or even (\d+),(\d+))
with:
\2,\1
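The same replacement can be tried in Python, whose re.sub accepts the (\d+),(\d+) form of the pattern and the \2,\1 replacement unchanged. This is a sketch for experimenting, not SciTE's own engine, and the function name swap_pairs is made up here.

```python
import re

def swap_pairs(text):
    """Swap the two numbers in every 'first,second' pair found in text."""
    # (\d+),(\d+) tags both numbers; \2,\1 writes them back in reverse order.
    return re.sub(r"(\d+),(\d+)", r"\2,\1", text)

print(swap_pairs("12,345 and 7,8"))
```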
Regular expression syntax depends on a parameter: find.replace.regexp.posix
If set to 0, syntax uses the old Unix style where \( and \)
mark capturing sections while ( and ) are themselves.
If set to 1, syntax uses the more common style where ( and )
mark capturing sections while \( and \) are plain parentheses.
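Since the two styles differ only in whether the parentheses carry a backslash, a pattern written for one setting can be converted for the other mechanically. The following is a minimal sketch under that assumption. The helper toggle_paren_style is hypothetical, is not part of SciTE, and does not treat parentheses inside a [set] specially.

```python
def toggle_paren_style(pattern):
    """Convert a pattern between the old Unix and posix parenthesis styles."""
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            nxt = pattern[i + 1]
            # An escaped parenthesis becomes a bare one; other escapes are kept.
            out.append(nxt if nxt in "()" else ch + nxt)
            i += 2
        elif ch in "()":
            # A bare parenthesis becomes an escaped one.
            out.append("\\" + ch)
            i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)

print(toggle_paren_style(r"\([0-9]+\),\([0-9]+\)"))
```

Applying the function twice returns the original pattern, which is a quick way to check a conversion.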
A character matches itself unless it is a special character (metacharacter): . \ [ ] * + ^ $ and, in posix mode, ( ).
. matches any character.
\a, \b, \f, \n, \r, \t, \v
match the corresponding C escape characters,
respectively BEL, BS, FF, LF, CR, TAB and VT. Note that \r and \n are never matched, because in Scintilla
regular expression searches are made line by line (stripped of end-of-line characters).
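The effect of searching line by line can be imitated in Python by splitting the text first and searching each line on its own. This is a sketch of the idea only, not how Scintilla is implemented, and the function name search_line_by_line is made up.

```python
import re

def search_line_by_line(pattern, text):
    """Return (line_number, matched_text) pairs, searching each line separately."""
    hits = []
    # splitlines() drops the end-of-line characters before each search.
    for number, line in enumerate(text.splitlines(), start=1):
        for match in re.finditer(pattern, line):
            hits.append((number, match.group(0)))
    return hits

text = "one\ttwo\nthree"
print(search_line_by_line(r"\t", text))          # the TAB on line 1 is found
print(search_line_by_line(r"two\nthree", text))  # empty: no single line contains LF
```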
[set] matches one of the characters in the set. If the first character in the set is ^, it matches the characters NOT in the set,
i.e. complements the set. A shorthand S-E (start dash end) is
used to specify a range of characters from S up to E, inclusive. The special characters ] and
- have no special meaning if they appear as the first characters in the set. To include both,
put - first: [-]A-Z] (or just escape them with a backslash).
example     match
[-]|]       any of the 3 chars - ] and |
[]-|]       any char from ] to |
[a-z]       any lowercase alpha
[^-]]       any char except - and ]
[^A-Z]      any char except uppercase alpha
[a-zA-Z]    any alpha
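The set rules above (leading ^, literal - and ] in first position, S-E ranges) can be made concrete with a small interpreter. This is a minimal sketch written from the description in this document, not Scintilla's code. The name compile_set is hypothetical, and backslash escapes inside a set are not handled.

```python
def compile_set(pattern):
    """Turn a bracketed set such as '[a-z]' into a one-character test function."""
    if not pattern.startswith("["):
        raise ValueError("a set must start with [")
    i = 1                                    # skip the opening [
    negate = pattern[i:i + 1] == "^"
    if negate:
        i += 1
    members = set()
    prev = None                              # possible start S of an S-E range
    for literal in "-]":                     # - then ] are plain when they come first
        if pattern[i:i + 1] == literal:
            members.add(literal)
            prev = literal
            i += 1
    while i < len(pattern) and pattern[i] != "]":
        ch = pattern[i]
        nxt = pattern[i + 1:i + 2]
        if ch == "-" and prev is not None and nxt not in ("", "]"):
            # S-E shorthand: add every character from S up to E, inclusive.
            members.update(chr(c) for c in range(ord(prev), ord(nxt) + 1))
            prev = None
            i += 2
        else:
            members.add(ch)
            prev = ch
            i += 1
    if i >= len(pattern):
        raise ValueError("unterminated set: " + pattern)
    return lambda char: (char in members) != negate

is_member = compile_set("[-]|]")
print([c for c in "-]|a" if is_member(c)])
```

Running the sketch against the rows of the table is a convenient way to check one's reading of an unusual set before using it in a search.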
A regular expression form followed by the closure char (*) matches zero or more matches of that form.
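As a small illustration of "zero or more", the following Python sketch checks a few strings against ab*c. Python's re module is used here only as a stand-in for experimenting, and the function name is made up.

```python
import re

def is_a_bs_c(text):
    """True when text is 'a', then zero or more 'b', then 'c'."""
    return re.fullmatch(r"ab*c", text) is not None

for candidate in ["ac", "abc", "abbbc", "adc"]:
    print(candidate, is_a_bs_c(candidate))
```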
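A form followed by + matches one or more matches of that form.
A regular expression enclosed as \(form\) (or (form) with posix flag) matches
what form matches.
The enclosure creates a set of tags, used for back-references (\1 to \9) and for
pattern substitution. The tagged forms are numbered starting from 1.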
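Tag numbering can be seen in a short Python sketch, where the posix-style ( ) enclosures are numbered from 1 in the order they open. The sample string is made up, and Python's re module is only a stand-in for the editor's engine.

```python
import re

# Two tagged forms: tag 1 is the number before the comma, tag 2 the one after.
match = re.search(r"([0-9]+),([0-9]+)", "size=640,480;")
tags = {number: match.group(number) for number in (1, 2)}
print(tags)
```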
A \ followed by a digit 1 to 9 matches whatever a
previously tagged regular expression matched.
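A back-reference inside the search pattern itself can be sketched in Python, which accepts the same \1 notation. The function name and sample text are made up for this illustration.

```python
import re

def find_self_assignments(text):
    """Return each run of lowercase letters that is followed by '=' and itself."""
    # ([a-z]+) tags some letters; \1 then requires exactly the same text again.
    return [m.group(1) for m in re.finditer(r"([a-z]+)=\1", text)]

print(find_self_assignments("a=b x=x foo=foo"))
```

Because \1 repeats the matched text rather than the pattern, a=b is not reported even though b would also match [a-z]+.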
\< \>: a regular expression starting with a \< construct
and/or ending with a \> construct restricts the
pattern matching to the beginning of a word and/or
the end of a word. A word is defined to be a character
string beginning and/or ending with the characters
A-Z a-z 0-9 and _. Scintilla extends this definition
by user setting. The word must also be preceded and/or
followed by any character outside those mentioned.
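Python's re module has no \< or \> constructs, but their effect can be approximated with \b plus a lookahead or lookbehind. The sketch below makes that assumption explicit. The names WORD_START, WORD_END and find_word_bounded are made up, and re.ASCII restricts word characters to A-Z a-z 0-9 and _ as in the definition above, without the user-setting extension.

```python
import re

WORD_START = r"\b(?=\w)"    # stand-in for \<: a boundary followed by a word char
WORD_END = r"\b(?<=\w)"     # stand-in for \>: a boundary preceded by a word char

def find_word_bounded(body, text, at_start=False, at_end=False):
    """Return start offsets of body, optionally tied to word start and/or end."""
    pattern = (WORD_START if at_start else "") + body + (WORD_END if at_end else "")
    return [m.start() for m in re.finditer(pattern, text, flags=re.ASCII)]

text = "cat concat catalog"
print(find_word_bounded("cat", text))                              # every occurrence
print(find_word_bounded("cat", text, at_start=True))               # like \<cat
print(find_word_bounded("cat", text, at_end=True))                 # like cat\>
print(find_word_bounded("cat", text, at_start=True, at_end=True))  # like \<cat\>
```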
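\l stands for a backslash followed by a class letter (for example \d for a decimal digit, as used in (\d+),(\d+) above) and denotes a character class.
\xHH denotes the character whose code is given by the two hexadecimal digits HH.
^ and $ anchor the match to the beginning and the end of a line, respectively.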
Most of this documentation was originally written by Ozan S. Yigit.
Additions by Neil Hodgson and Philippe Lhoste.
All of this document is in the public domain.