Lookaround
Lookarounds assert that a certain expression matches before or after the current position. As an assertion, a lookaround does not consume any text; it matches between two code points.
Syntax
let Lookaround = LookaroundPrefix Expression;
let LookaroundPrefix = | '<<' | '>>';
See Expression.
A lookaround must be wrapped in parentheses if it is followed by another expression:
(>> [word]) [Greek]
Note that a lookaround contains an expression, so it introduces a new scope and can include statements.
Example
(!<< [w])(>> disable unicode; let aw = [w]; aw{3})
Support
Support for lookaround is gated by the lookahead and lookbehind features. Specify features with the --allowed-features option.
Lookahead is supported almost everywhere. Lookbehind support is more limited:
PCRE
PCRE does not support arbitrary-length lookbehind. PCRE must be able to determine the length of the lookbehind in advance, so << 'foo'{3} works, but << 'foo'+ does not. As a special case, a lookbehind containing an alternation works even if the alternatives have different lengths, but each alternative must be constant-length.
JavaScript
JavaScript fully supports lookahead and lookbehind. However, lookbehind is still unsupported in some older browsers (notably, Safari up to version 16.3).
Java
Before Java 13, repetition in lookbehind had to be bounded, so * and + did not work. Since Java 13, repetition can be unbounded, but Java may not correctly handle repetition with multiple quantifiers if one of them is unbounded. Lookbehind also may not contain backreferences.
Python
Python supports lookahead and constant-length lookbehind. Variable-length repetitions and alternations whose alternatives differ in length, such as << 'a' | 'bb', are forbidden in lookbehind.
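The restriction can be checked directly with Python's standard re module. The following is a minimal sketch using hand-written regexes of the kind a lookbehind might compile to; the patterns and the compiles helper are illustrative assumptions, not actual Pomsky output.

```python
import re

def compiles(pattern: str) -> bool:
    """Hypothetical helper: return True if Python's re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

# A fixed-width lookbehind is accepted.
fixed_ok = compiles(r"(?<=foo)bar")
# Alternatives of equal length still have a fixed width.
equal_alt_ok = compiles(r"(?<=ab|cd)x")
# Alternatives of different lengths are rejected.
mixed_alt_ok = compiles(r"(?<=a|bb)x")
# Unbounded repetition is rejected as well.
unbounded_ok = compiles(r"(?<=a+)x")
```

In this sketch only the first two patterns compile; the other two raise re.error because their width cannot be determined in advance.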
Ruby, .NET
Full support for both lookahead and lookbehind.
Lookaround not supported
Behavior
Lookahead checks whether the contained expression matches at the current position. If it matches, the lookahead succeeds; otherwise it fails. Lookahead can be negated: a negative lookahead succeeds if the expression does not match. After the lookahead succeeds, the regex engine returns to the position in the string where it was before the lookahead, so the text matching the lookahead is not consumed.
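The non-consuming behavior of lookahead can be observed with Python's re module. This is a small sketch on hand-written regexes (the sample strings are assumptions chosen for illustration):

```python
import re

text = "foobar"

# Positive lookahead: 'foo' must be followed by 'bar', but 'bar' is not consumed.
m = re.match(r"foo(?=bar)", text)
matched = m.group(0)
end_pos = m.end()

# Because the engine returned to the position after 'foo',
# 'bar' can still be matched by the rest of the pattern.
full = re.match(r"foo(?=bar)bar", text).group(0)

# Negative lookahead succeeds only if the expression does not match.
neg_fail = re.match(r"foo(?!bar)", "foobar")
neg_ok = re.match(r"foo(?!bar)", "foobaz")
```

Here the match ends right after "foo" even though the lookahead inspected "bar", which is why the same text can be matched again afterwards.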
Conceptually, lookbehind works in the same way, except that the expression is matched in reverse direction against the text preceding the current position. In reality, however, many regex engines do not match in reverse direction but go back n characters and check if the next n characters match the lookbehind.
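The "go back n characters" strategy can be sketched for the simplest case, a literal of known length. The helper lookbehind_literal below is hypothetical and only illustrates the idea; real engines handle more general fixed-length patterns.

```python
import re

def lookbehind_literal(text: str, pos: int, literal: str) -> bool:
    """Hypothetical helper: check a fixed-length lookbehind at position `pos`.

    Steps back len(literal) characters and compares forward,
    instead of matching in reverse direction.
    """
    n = len(literal)
    start = pos - n
    if start < 0:
        return False  # not enough text before the current position
    return text[start:pos] == literal

text = "foobar"
# Positions where a lookbehind for 'foo' succeeds, computed by hand.
manual = [pos for pos in range(len(text) + 1) if lookbehind_literal(text, pos, "foo")]
# The same positions according to Python's regex engine.
engine = [m.start() for m in re.finditer(r"(?<=foo)", text)]
```

This approach only works when n is known in advance, which is one way to see why several engines require constant-length lookbehind.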
Compilation
>> ... is compiled to (?=...)
!>> ... is compiled to (?!...)
<< ... is compiled to (?<=...)
!<< ... is compiled to (?<!...)
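The mapping above can be written down as a small lookup table and tried out in Python's re module. This is only a sketch: the wrap helper is hypothetical, and it assumes the inner expression has already been compiled to a regex string.

```python
import re

# Pomsky lookaround prefix -> opening of the compiled regex group.
COMPILED = {
    ">>": "(?=",
    "!>>": "(?!",
    "<<": "(?<=",
    "!<<": "(?<!",
}

def wrap(prefix: str, body: str) -> str:
    """Hypothetical helper: wrap an already-compiled regex body in a lookaround group."""
    return COMPILED[prefix] + body + ")"

ahead = wrap(">>", "b")
not_ahead = wrap("!>>", "b")
behind = wrap("<<", "a")
not_behind = wrap("!<<", "a")

# Positions in "ab" where each zero-width assertion holds.
positions = {
    pattern: [m.start() for m in re.finditer(pattern, "ab")]
    for pattern in (ahead, not_ahead, behind, not_behind)
}
```

In the string "ab", both positive assertions hold only between the two letters, and both negated forms hold at the remaining positions.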
Issues
The various limitations on lookbehind by different regex engines are not enforced at the moment.
Security concerns
Lookbehind can be slow in some regex engines.
History
Initial implementation in Pomsky 0.1