Modifier
Modifiers change how the following expression should be treated.
Syntax
let Modifier = ModifierKeyword BooleanSetting ';';
let ModifierKeyword = | 'enable' | 'disable';
let BooleanSetting = | 'lazy' | 'unicode';
Example
enable lazy;
disable unicode;
[w]* ( disable lazy; .+ )
Support
Modifiers are supported in all flavors.
Support for each mode is gated by the lazy-mode and ascii-mode features. Specify features with
the --allowed-features option.
Behavior
Modes can be enabled and disabled in any scope.
Two modes can be enabled or disabled: lazy and unicode.
Lazy
Enabling lazy mode makes every repetition in the same scope lazy by default. To opt out for a
single repetition, add the greedy keyword, e.g.
enable lazy;
[w]* greedy
Unicode
Unicode mode is enabled by default. Disabling it means that expressions in the same scope
are no longer Unicode-aware and assume ASCII-only input. As a result, shorthand character classes
are compiled differently (e.g. [space] is compiled to [ \t-\r]), and Unicode properties (e.g.
[Greek]) are unavailable. Non-ASCII strings and code points are still allowed.
In JavaScript, Unicode must be disabled in order to use %, < and > word boundaries.
Disabling Unicode can vastly improve runtime performance, especially for [word] and [digit].
Alternatively, you can use [ascii_word], [ascii_digit], and so on.
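The difference between Unicode-aware and ASCII-only shorthand classes can be sketched with Python's re module. This is only an illustration: the Python patterns below are hand-written stand-ins, and treating re.ASCII as the counterpart of disabled Unicode mode is an assumption, since the regex Pomsky actually emits depends on the flavor. The only pattern taken from the text above is [ \t-\r].

```python
import re

# Hand-written stand-ins (assumption): not generated by Pomsky.
UNICODE_WORD = re.compile(r"\w")            # Unicode-aware word class
ASCII_WORD = re.compile(r"\w", re.ASCII)    # ASCII-only word class
UNICODE_DIGIT = re.compile(r"\d")           # Unicode-aware digit class
ASCII_DIGIT = re.compile(r"\d", re.ASCII)   # ASCII-only digit class
UNICODE_SPACE = re.compile(r"\s")           # Unicode-aware whitespace class
ASCII_SPACE = re.compile(r"[ \t-\r]")       # the class quoted above for [space]


def matches(pattern, ch):
    """Return True if the single character ch is accepted by pattern."""
    return pattern.fullmatch(ch) is not None


E_ACUTE = "\u00e9"       # Latin small letter e with acute
ARABIC_THREE = "\u0663"  # Arabic-Indic digit three
NBSP = "\u00a0"          # no-break space

# Every character up to U+3000 that the ASCII-only space class accepts.
ascii_spaces = {chr(cp) for cp in range(0x3001) if matches(ASCII_SPACE, chr(cp))}

print(matches(UNICODE_WORD, E_ACUTE), matches(ASCII_WORD, E_ACUTE))
print(matches(UNICODE_DIGIT, ARABIC_THREE), matches(ASCII_DIGIT, ARABIC_THREE))
print(matches(UNICODE_SPACE, NBSP), matches(ASCII_SPACE, NBSP))
print(sorted(ascii_spaces))
```

The ASCII-only classes reject the accented letter, the Arabic-Indic digit, and the no-break space, while the Unicode-aware ones accept them. The range \t-\r covers code points 9 through 13, so together with the space character it spans exactly the six ASCII whitespace characters.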
Compilation
Modifiers produce no output, but they change how other expressions are compiled.
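As a rough illustration of this effect, the Python sketch below compares two hand-written regexes for the same repetition, one lazy and one greedy. It assumes that a lazy [w]* corresponds to \w*? and a greedy one to \w*; these patterns were not produced by running Pomsky, and the exact output may differ by flavor.

```python
import re

# Hand-written regexes (assumption): stand-ins for what the repetition
# [w]* might compile to with and without lazy mode.
LAZY_PATTERN = r"\w*?"    # lazy repetition: matches as little as possible
GREEDY_PATTERN = r"\w*"   # greedy repetition: matches as much as possible

text = "hello world"

# Both patterns match at position 0, but they consume different amounts.
lazy_match = re.match(LAZY_PATTERN, text).group()
greedy_match = re.match(GREEDY_PATTERN, text).group()

print(repr(lazy_match))
print(repr(greedy_match))
```

The modifier itself adds nothing to the pattern; only the quantifier of the affected repetition changes. With nothing following the repetition, the lazy form matches the empty string, while the greedy form consumes the whole first word.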
Issues
The dot and word boundaries are Unicode-aware in some regex engines even when Unicode mode is disabled.
Some mode modifiers are not yet implemented, most importantly ignore_case, single_line and
multi_line.
History
- Non-Unicode mode added in Pomsky 0.10
- Lazy mode added in Pomsky 0.3