Substitutions
Regular expression substitution can
be used to program automatic editing operations. For
example, the following are search and replace strings
to find occurrences of the `C' language subroutine
`get_x', reverse the first
and second parameters, add a third parameter of NULL,
and change the name to `new_get_x':
- Search string: `get_x
*\( *([^ ,]*), *([^\)]*)\)'
- Replace string: `new_get_x(\2,
\1, NULL)'
Ambiguity
If a regular expression could match
two different parts of the text, it will match the
one which begins earliest. If both begin in the same
place but match different lengths, or match the same
length in different ways, life gets messier, as follows.
In general, the possibilities in a
list of alternatives are considered in left-to-right
order. The possibilities for `*',
`+', and `?'
are considered longest-first, nested constructs are
considered from the outermost in, and concatenated
constructs are considered leftmost-first. The match
that will be chosen is the one that uses the earliest
possibility in the first choice that has to be made.
If there is more than one choice, the next will be
made in the same manner (earliest possibility) subject
to the decision on the first choice. And so forth.
For example, `(ab|a)b*c'
could match `abc' in one
of two ways. The first choice is between `ab'
and `a'; since `ab'
is earlier, and does lead to a successful overall
match, it is chosen. Since the `b'
is already spoken for, the `b*'
must match its last possibility, the empty string,
since it must respect the earlier choice.
In the particular case where no `|'s
are present and there is only one `*',
`+', or `?',
the net effect is that the longest possible match
will be chosen. So `ab*',
presented with `xabbbby',
will match `abbbb'. Note
that if `ab*' is tried
against `xabyabbbz', it
will match `ab' just after
`x', due to the begins-earliest
rule. (In effect, the decision on where to start the
match is the first choice to be made, hence subsequent
choices must respect it even if this leads them to
less-preferred alternatives.)
|