Modifier
Modifiers change how the following expression should be treated.
Syntax
let Modifier = ModifierKeyword BooleanSetting ';';
let ModifierKeyword = | 'enable' | 'disable';
let BooleanSetting = | 'lazy' | 'unicode';
Example
enable lazy;
disable unicode;
[w]* (disable lazy; .+)
Support
Modifiers are supported in all flavors.
Support for each mode is gated by the lazy-mode and ascii-mode features. Specify features with the --allowed-features option.
Behavior
Modes can be enabled and disabled in any scope.
There are two modes that can be enabled or disabled: lazy and unicode.
Enabling lazy mode makes all repetitions in the same scope lazy by default. Individual repetitions can opt out with the greedy keyword, e.g.
enable lazy;
[w]* greedy
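To make the difference between lazy and greedy repetition concrete, here is a minimal sketch in Python. It uses Python's re module directly, not Pomsky, and the patterns are hand-written illustrations rather than Pomsky's compiled output; the sample text is made up.

```python
import re

text = "foo_boo bar"

# Greedy repetition consumes as many word characters as possible.
greedy = re.match(r"\w*", text).group()

# Lazy repetition consumes as few as possible (here: none at all).
lazy = re.match(r"\w*?", text).group()

# When more pattern follows, a lazy repetition stops at the first
# position where the rest can match; a greedy one stops at the last.
lazy_then_o = re.match(r"\w*?o", text).group()
greedy_then_o = re.match(r"\w*o", text).group()

print(repr(greedy), repr(lazy), repr(lazy_then_o), repr(greedy_then_o))
```

In a scope with lazy mode enabled, a plain repetition would behave like the lazy variants here, and one marked greedy like the greedy variants.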
Unicode
Unicode mode is enabled by default. Disabling it means that expressions in the same scope are no longer Unicode-aware and assume ASCII-only input. As a result, shorthand character classes are compiled differently (e.g. [space] is compiled to [ \t-\r]), and Unicode properties (e.g. [Greek]) are unavailable. Non-ASCII strings and code points are still allowed.
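As a rough illustration of what "ASCII-only" shorthand classes mean in practice, the following Python sketch compares a Unicode-aware whitespace class with the ASCII class [ \t-\r] quoted above. It assumes Python's re module as the engine; other engines may define Unicode whitespace and digits slightly differently, and none of these patterns are actual Pomsky output.

```python
import re

# Sample characters: space, tab, no-break space (U+00A0), em space (U+2003).
samples = [" ", "\t", "\u00a0", "\u2003"]

# Unicode-aware whitespace shorthand (Python's default for str patterns).
unicode_space = re.compile(r"\s")
# The ASCII-only class: space plus the range from tab (U+0009) to CR (U+000D).
ascii_space = re.compile(r"[ \t-\r]")

unicode_hits = [bool(unicode_space.fullmatch(c)) for c in samples]
ascii_hits = [bool(ascii_space.fullmatch(c)) for c in samples]

# Digits behave similarly: U+0663 (Arabic-Indic digit three) is a Unicode
# digit, but it is not matched once the shorthand is restricted to ASCII.
unicode_digit = bool(re.fullmatch(r"\d", "\u0663"))
ascii_digit = bool(re.fullmatch(r"\d", "\u0663", flags=re.ASCII))

print(unicode_hits, ascii_hits, unicode_digit, ascii_digit)
```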
In JavaScript, Unicode must be disabled in order to use the %, < and > word boundaries.
Disabling Unicode can vastly improve runtime performance, especially for [word] and [digit]. Alternatively, you can use [ascii_word], [ascii_digit], and so on.
Compilation
Modifiers produce no output, but they change how other expressions are compiled.
Issues
The dot and word boundaries are Unicode-aware in some regex engines even when Unicode mode is disabled.
Some mode modifiers are not yet implemented, most importantly ignore_case, single_line and multi_line.
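The word boundary issue can be sketched in Python. This is only an illustration under stated assumptions: it uses Python's re module, where \b is Unicode-aware by default for str patterns, and the pattern is hand-written rather than produced by Pomsky. It shows how an ASCII-only character class can still be affected by a Unicode-aware boundary.

```python
import re

# "é" (U+00E9) followed by the ASCII letters "ab".
text = "\u00e9ab"

# The character class is ASCII-only, but \b still uses the engine's
# Unicode-aware notion of a word character. "é" counts as a word
# character, so there is no boundary between "é" and "a".
unicode_boundary = re.search(r"\b[a-z]+\b", text)

# With ASCII semantics for \b, "é" is a non-word character, so a
# boundary exists before "a" and the pattern finds "ab".
ascii_boundary = re.search(r"\b[a-z]+\b", text, flags=re.ASCII)

print(unicode_boundary, ascii_boundary.group())
```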
History
- Non-Unicode mode added in Pomsky 0.10
- Lazy mode added in Pomsky 0.3