Formal grammar


This document uses Pomsky syntax to describe Pomsky’s syntax. Here’s an incomplete summary, which is enough to read the grammar:

  • Variables are declared as let var_name = expression;. This means that var_name can be parsed by parsing expression.

  • Verbatim text is wrapped in double quotes ("") or single quotes ('').

  • A * after a rule indicates that it repeats 0 or more times.

  • A + after a rule indicates that it repeats 1 or more times.

  • A ? after a rule indicates that the rule is optional.

  • Rules can be grouped together by wrapping them in parentheses (()).

  • Alternative rules are each preceded by a vertical bar (|).
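
As an illustration of the notation summarized above, here is a minimal sketch of a small grammar. All rule names (Bit, Sign, Bits, Padding, SignedBits, List) are hypothetical and are not part of Pomsky's grammar.

```pomsky
# Hypothetical rules, for illustration only.
let Bit = | '0' | '1';                             # alternatives, each preceded by |
let Sign = | '+' | '-';
let Bits = Bit+;                                   # + : one or more times
let Padding = ' '*;                                # * : zero or more times
let SignedBits = Sign? Bits;                       # ? : optional
let List = SignedBits (',' Padding SignedBits)*;   # parentheses group rules

List
```

Read this way, List can be parsed by parsing one SignedBits, followed by any number of comma-separated SignedBits.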

Formal grammar

Comments start with # and end at the end of the same line. Comments and whitespace are ignored; they can be added anywhere between tokens. Tokens are

  • identifiers (e.g. foo)
  • keywords and reserved words (e.g. lazy)
  • operators and punctuation (e.g. << or ;)
  • numbers (e.g. 30)
  • string literals (e.g. "foo")
  • codepoints

Each token type is documented in detail separately.
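
The token kinds listed above can be seen together in a short sketch. The variable name num is hypothetical, and the codepoint notation (U+ followed by hexadecimal digits) is an assumption here, since this section does not define it.

```pomsky
# This comment ends at the end of the line.
let num = ['0'-'9'];    # keyword `let`, identifier `num`, punctuation `=` and `;`

(<< "v")                # operator `<<`, string literal "v"
num{1,30} lazy          # numbers 1 and 30, keyword `lazy`
U+2D                    # a codepoint (assumed notation)
   num+                 # extra whitespace between tokens is ignored
```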

Note about this grammar

Even though this grammar is written using Pomsky syntax, the Pomsky compiler does not actually accept it, because it uses cyclic variables: variables whose definitions refer back to themselves, directly or through other variables.
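
To make the note concrete, here is a minimal sketch of a cyclic definition. The names A and B are hypothetical. Because each variable is defined in terms of the other, the compiler would be expected to reject input of this shape, for the same reason it does not accept the grammar below.

```pomsky
# Hypothetical example of cyclic variables (expected to be rejected).
let A = 'a' B?;   # A refers to B
let B = 'b' A?;   # B refers back to A, which closes the cycle

A
```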


let Expression = Statement* Alternation;

See Statement, Alternation.


let Statement =
    | LetDeclaration
    | Modifier
    | Test;

See LetDeclaration, Modifier, Test.
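
As a rough sketch, an Expression with one statement of each kind might look as follows. The concrete syntax of modifiers and tests is defined by the Modifier and Test rules, which are not shown in this section, so the forms used here are assumptions. The variable name octet is hypothetical.

```pomsky
disable unicode;              # Modifier (assumed syntax)
let octet = ['0'-'9']{1,3};   # LetDeclaration
test {                        # Test (assumed syntax)
  match '1.2.3';
}

octet ('.' octet)* | 'none'   # the Alternation that ends the Expression
```

All statements come first, and the expression ends with a single Alternation, as required by Statement* Alternation.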


A FixExpression is an expression that can have a prefix or a suffix.

let FixExpression =
    | Lookaround
    | Negation
    | Repetition;

See Lookaround, Negation, Repetition.
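
The following sketch shows one possible example per alternative. The exact prefixes and suffixes are defined by the Lookaround, Negation and Repetition rules, which are not shown in this section, so the operators used here are assumptions.

```pomsky
(>> 'a')        # Lookaround: prefix `>>`, wrapped in parentheses as a group
!['0'-'9']      # Negation: prefix `!`
'ab'{2,4}       # Repetition: suffix `{2,4}`
'c'?            # Repetition: suffix `?`
```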


let AtomExpression =
    | String
    | CodePoint
    | Group
    | CharacterSet
    | InlineRegex
    | Boundary
    | Reference
    | NumberRange
    | Variable
    | Dot
    | Recursion;

See String, CodePoint, Group, CharacterSet, InlineRegex, Boundary, Reference, NumberRange, Variable, Dot, Recursion.
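
For orientation, here is a sketch with roughly one example per alternative of AtomExpression. The concrete syntax of each atom is defined by its own rule, not shown in this section, so the forms below are assumptions. The names greeting and num are hypothetical. Recursion is left out of the sketch.

```pomsky
let greeting = 'hi';    # declares a variable to use below

^                       # Boundary (start)
"text"                  # String
U+20                    # CodePoint
:num(['0'-'9'])         # Group named `num`, containing a CharacterSet
regex '[a-f]'           # InlineRegex
::num                   # Reference to the group `num`
range '0'-'255'         # NumberRange
greeting                # Variable
.                       # Dot
$                       # Boundary (end)
```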