Comparison with other projects
This wiki has a list of projects with similar goals to Pomsky. Here’s a list of the most popular projects:
Project | Types |
---|---|
Melody | Transpiled |
Pomsky | Transpiled |
Egg Expressions | Transpiled, App: Oil shell |
Rx Expressions | Transpiled, App: Emacs |
Raku Grammars | App: Raku |
Rosie | App: Rosie |
SRL | DSL: PHP |
Super Expressive | DSL: JS |
Verbal Expressions | DSL: JS |
Swift RegexBuilder | DSL: Swift |
Since this content is likely to get out of date, I encourage you to update it.
Transpiled
These languages are transpiled to “normal” regular expressions and can therefore be used anywhere. They usually have a command-line interface to compile expressions.
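Because the result of transpilation is an ordinary pattern string, the host program needs no extra runtime library. A minimal sketch in JavaScript, assuming a transpiler has already emitted the pattern below (the pattern string and the variable names are illustrative, not actual output of any tool listed here):

```javascript
// Assumed transpiler output: a plain ECMAScript-compatible pattern string.
// (Illustrative only; not copied from any specific tool.)
const compiledPattern = 'https?://(?:www\\.)?[^ ]*';

// The string is used exactly like a hand-written regex.
const urlRegex = new RegExp('^' + compiledPattern + '$');

console.log(urlRegex.test('https://www.example.com')); // true
console.log(urlRegex.test('ftp://example.com'));       // false
```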
Application specific
Some regex languages are specific to a certain application or programming language. For example, Raku grammars can only be used in Raku; Egg Expressions are transpiled, but they are only available in the Oil shell.
DSLs (domain-specific languages) are languages that are embedded in another language using the host language’s syntax. For example, Verbal Expressions uses JavaScript methods:
    const tester = VerEx()
      .startOfLine()
      .then('http')
      .maybe('s')
      .then('://')
      .maybe('www.')
      .anythingBut(' ')
      .endOfLine()
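To show how such an embedded DSL can work internally, here is a minimal sketch of a chainable builder that accumulates regex source text. `PatternBuilder` and all of its methods are hypothetical and only loosely modeled on the chain above; this is not the Verbal Expressions implementation.

```javascript
// Hypothetical mini-DSL for illustration; NOT the Verbal Expressions code.
class PatternBuilder {
  constructor() {
    this.source = '';
  }

  // Escape regex metacharacters so literal text matches verbatim.
  static escape(text) {
    return text.replace(/[.*+?^${}()|[\]\\-]/g, '\\$&');
  }

  // Every method appends to the pattern and returns `this` to allow chaining.
  startOfLine() { this.source += '^'; return this; }
  endOfLine() { this.source += '$'; return this; }
  then(text) { this.source += `(?:${PatternBuilder.escape(text)})`; return this; }
  maybe(text) { this.source += `(?:${PatternBuilder.escape(text)})?`; return this; }
  anythingBut(chars) { this.source += `[^${PatternBuilder.escape(chars)}]*`; return this; }

  toRegExp() { return new RegExp(this.source); }
}

const builtRegex = new PatternBuilder()
  .startOfLine()
  .then('http')
  .maybe('s')
  .then('://')
  .maybe('www.')
  .anythingBut(' ')
  .endOfLine()
  .toRegExp();

console.log(builtRegex.test('https://www.example.com')); // true
console.log(builtRegex.test('https://www.exa mple.com')); // false
```

The design point is that the DSL reuses the host language's syntax (method calls), so it needs no separate parser, but it is also tied to that host language.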
This page currently only discusses transpiled languages, but I welcome contributions.
Compatibility
Let’s see which regex flavors are supported by transpiled languages.
Flavor | Melody | Pomsky | Egg Expr. | Rx Expr. |
---|---|---|---|---|
ERE | ✅ | ✅ | ||
ECMAScript | ✅ | ✅ | ||
PCRE | ✅* | ✅ | ||
.NET | ✅* | ✅ | ||
Java | ✅* | ✅ | ||
Ruby | ✅* | ✅ | ||
Python | ✅ | |||
Rust | ✅ | |||
RE2 | ✅ |
*Melody can only emit ECMAScript regexes, but they also happen to be compatible with several other flavors.
Explanation of the flavors
- ERE (extended regular expressions) are used by tools such as GNU grep and awk. Because ERE supports only the most basic features, it is mostly forward compatible with other regex flavors.
- ECMAScript is the syntax used in JavaScript and related languages (TypeScript, Elm, Dart, etc.) that are compiled to JS.
- PCRE (an acronym for “Perl Compatible Regular Expressions”) is the syntax used by the PCRE2 regex engine, which is the default in at least Crystal, Delphi, Elixir, Erlang, Hack, Julia, PHP, R, and Vala. It’s also a popular choice in other languages like C and C++ and is used in many applications such as the Apache server, nginx, MariaDB, MongoDB, and optionally in GNU grep.
- .NET refers to the .NET regular expressions, used by languages such as C# and F#.
- Java refers to the Pattern class in the Java standard library. It is also used in Kotlin and Scala.
- Ruby refers to built-in regular expressions in Ruby (using the Oniguruma regex library).
- Python refers to Python’s re module. Note that Python 3 is required for good Unicode support.
- Rust refers to Rust’s popular regex crate (used by ripgrep).
- RE2 refers to Google’s re2 library; this flavor is also compatible with Go’s regexp package.
Many more flavors exist, which are not (or only partially) supported by Pomsky and other languages.
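Flavors differ in syntax even for common features, which is why a transpiler needs to know its target. As one illustration, drawn from general knowledge of the engines rather than from the table above: ECMAScript writes named groups as `(?<name>…)`, whereas Python's `re` module writes them as `(?P<name>…)`. The sketch below checks both spellings in a JavaScript engine; `compiles` is a hypothetical helper.

```javascript
// Hypothetical helper: reports whether a pattern is valid in this engine.
function compiles(pattern) {
  try {
    new RegExp(pattern);
    return true;
  } catch (err) {
    return false;
  }
}

const ecmaScriptStyle = '(?<year>\\d{4})'; // ECMAScript named group
const pythonStyle = '(?P<year>\\d{4})';    // Python-style named group

console.log(compiles(ecmaScriptStyle)); // true
console.log(compiles(pythonStyle));     // false in ECMAScript engines
```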
Features
Let’s see which regex features are supported by languages that are transpiled to regular expressions.
Basic regex features
Feature | Melody | Pomsky | Egg Expr. | Rx Expr. |
---|---|---|---|---|
Greedy repetition | ✅ | ✅ | ✅ | ✅ |
Lazy repetition | ✅ | ✅ | ✅ | ✅ |
Dot | ✅ | ✅ | ✅ | ✅ |
Character escape | ✅ | ✅ | ✅ | ✅ |
Character class | ✅ | ✅ | ✅ | ✅ |
Anchor | ✅ | ✅ | ✅ | ✅ |
Word boundary | ✅ | ✅ | ✅ | ✅ |
Negated word boundary | ✅ | ✅ | ✅ | ✅ |
Character range | partly* | ✅ | ✅ | ✅ |
Character set | ✅ | ✅ | ✅ | |
Negated character set | partly* | ✅ | ✅ | ✅ |
Capturing group | ✅ | ✅ | ✅ | ✅ |
Alternation | ✅ | ✅ | ✅ | ✅ |
POSIX class | ✅ | ✅ | ✅ | |
Non-capturing group | ✅ | ✅ | ✅ |
*Character ranges and negated sets in Melody only support ASCII letters, digits and a few special characters.
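For readers unfamiliar with some of the feature names in this table, the following sketch shows a few of them in plain ECMAScript regex syntax (the target syntax, not the syntax of any of the compared languages). The sample strings are made up for illustration.

```javascript
const input = '<a><b>';

// Greedy repetition takes as much as possible.
const greedy = input.match(/<.+>/)[0];  // '<a><b>'
// Lazy repetition takes as little as possible.
const lazy = input.match(/<.+?>/)[0];   // '<a>'

// Word boundary vs. negated word boundary.
const wholeWord = /\bcat\b/.test('a cat sat');    // true
const insideWord = /\Bcat\B/.test('concatenate'); // true

// Negated character set: everything up to the first comma.
const firstField = 'red,green'.match(/[^,]+/)[0]; // 'red'

// Capturing group with alternation.
const extension = 'photo.png'.match(/\.(png|jpg)$/)[1]; // 'png'

console.log({ greedy, lazy, wholeWord, insideWord, firstField, extension });
```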
Advanced features
Feature | Melody | Pomsky | Egg Expr. | Rx Expr. |
---|---|---|---|---|
Variable/macro | ✅ | ✅ | ✅ | ✅ |
Line comment | ✅ | ✅ | ✅ | ✅ |
Block comment | ✅ | |||
Code point | ✅ | ✅ | ||
Lookaround | ✅ | ✅ | ||
Named capturing group | ✅ | ✅ | ✅ | |
Backreference | ✅ | ✅ | ||
Named backreference | ✅ | |||
Relative backreference | ✅ | |||
Unicode category | ✅ | ✅ | ||
Unicode script/block | ✅ | partly | ||
Unicode script extensions | ✅ | |||
Other Unicode property | ✅ | |||
Any code point | partly* | ✅ | partly* | partly* |
Any grapheme | ✅ | |||
Atomic group | ✅ | |||
Character set intersection | ✅ | ✅ | ||
Conditional | ||||
Recursion | ✅ | |||
Modifier | ||||
Inline regex | ✅ | ✅ | ||
Optimization | some** |
Note that Melody and Pomsky support inline regexes. Because of this, all regex features are technically supported in Melody and Pomsky, but inline regexes may be less ergonomic and more dangerous to use than properly supported features.
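*All languages can match any code point with the dot if the regex engine is put into the mode that lets the dot match line breaks (usually called “single-line” or “dotall” mode; Ruby calls it multiline mode).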
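The behavior of the dot can be checked directly in an ECMAScript engine. In this sketch the `s` flag lets the dot match line breaks, and the `u` flag makes it consume a whole code point instead of a single UTF-16 code unit; other flavors use different flag names, so treat this as an ECMAScript-specific illustration.

```javascript
// Without flags, the dot rejects line breaks.
const dotMatchesNewline = /^.$/.test('\n');     // false
const dotAllMatchesNewline = /^.$/s.test('\n'); // true (dotAll flag)

// One code point that occupies two UTF-16 code units.
const emoji = '\u{1F600}';
const dotMatchesEmoji = /^.$/.test(emoji);         // false
const unicodeDotMatchesEmoji = /^.$/u.test(emoji); // true (unicode flag)

// With both flags, the dot matches any single code point.
const anyCodePoint = /^.$/su;
console.log(anyCodePoint.test('\n'), anyCodePoint.test(emoji)); // true true
```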
**Pomsky can currently:
- optimize repetitions
- remove redundant or empty groups
- in character sets, deduplicate code points and merge overlapping ranges
- merge single-character alternations into character sets
More optimizations are planned.
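Each rewrite of this kind has to preserve what the expression matches. The sketch below spot-checks that idea in JavaScript. The before/after pattern pairs are hand-written assumptions illustrating the kinds of rewrites listed above, not actual Pomsky output, and `sameMatches` is a hypothetical helper that compares behavior on a few samples rather than proving equivalence.

```javascript
// Hand-written pattern pairs (assumptions for illustration only).
const rewrites = [
  { before: 'a|b|c', after: '[abc]' },         // alternation -> character set
  { before: '[a-fd-kxx]', after: '[a-kx]' },   // merged ranges, deduplicated
  { before: '(?:(?:ab))+', after: '(?:ab)+' }, // redundant group removed
  { before: '(?:a?)?', after: 'a?' },          // nested repetition simplified
];

const samples = ['', 'a', 'b', 'c', 'd', 'k', 'x', 'ab', 'abab', 'z'];

// Both patterns must accept exactly the same sample strings.
function sameMatches(before, after) {
  const full = (pattern) => new RegExp(`^(?:${pattern})$`);
  return samples.every((s) => full(before).test(s) === full(after).test(s));
}

const allEquivalent = rewrites.every(({ before, after }) => sameMatches(before, after));
console.log(allEquivalent); // true
```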
Tooling
Tool | Melody | Pomsky | Egg Expr. | Rx Expr. |
---|---|---|---|---|
CLI | ✅ | ✅ | ||
REPL | ✅ | ✅ | ||
Online playground | ✅ | ✅ | ||
VSCode extension | ✅ | ✅ | ||
IntelliJ extension | ✅ | |||
JavaScript bundler | Babel | Vite, Rollup, ESBuild, Webpack | ||
Rust macro | ✅ | |||
Linter | ||||
Formatter |
Packages
Tool | Melody | Pomsky |
---|---|---|
Homebrew | ✅ | ✅ |
AUR | ✅ | ✅ |
Nix | ✅ | ✅ |
GitHub release binary (Apple) | ✅ | ✅ |
GitHub release binary (Windows) | ✅ | |
GitHub release binary (Linux) | ✅ | |
Node module | ✅ | ✅ |
Python module | ✅ |
IDE features
Feature | Melody | Pomsky |
---|---|---|
Syntax highlighting | ✅ | ✅ |
Error highlighting | ✅ | |
Code folding | ✅1 | ✅1 |
Auto indentation | ✅ | ✅ |
Snippets | ✅ | ✅ |
Matching brackets and quotes | ✅2 | ✅ |
Keyword autocomplete | ✅2 | ✅ |
Variable autocomplete | ✅3 | |
Backreference autocomplete | ||
Character class autocomplete | ✅ | |
Unicode property autocomplete | ✅ | |
Hover tooltips | ||
Apply suggestions | ||
Share link (playground) | ✅ | ✅ |
1 indentation based
2 works in VSCode, but not in the playground
3 does not take scopes into account