Formal grammar

Summary

This document uses Pomsky syntax to describe Pomsky’s syntax. The following summary is incomplete, but it is enough to read the grammar:

  • Variables are declared as let var_name = expression;. This means that var_name can be parsed by parsing expression.

  • Verbatim text is wrapped in double quotes ("") or single quotes ('').

  • A * after a rule indicates that it repeats 0 or more times.

  • A + after a rule indicates that it repeats 1 or more times.

  • A ? after a rule indicates that the rule is optional.

  • Rules can be grouped together by wrapping them in parentheses (()).

  • Alternative rules are each preceded by a vertical bar (|).
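
As a rough illustration of this notation, the sketch below declares two made-up rules. Greeting and Sentence are hypothetical names chosen for this example; they are not part of Pomsky’s grammar.

```pomsky
# Hypothetical rules that only illustrate the notation
let Greeting =
    | 'hello'                 # verbatim text in single quotes
    | "hi";                   # verbatim text in double quotes; each alternative follows a |

let Sentence =
    Greeting                  # parse a Greeting first
    (', ' Greeting)*          # a parenthesized group, repeated 0 or more times
    '!'+                      # one or more exclamation marks
    ' '?;                     # an optional trailing space
```

Read this way, Sentence can be parsed by parsing a Greeting, then any number of comma-separated Greetings, then at least one exclamation mark and an optional space.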

Formal grammar

Comments start with # and end at the end of the same line. Comments and whitespace are ignored; they can be added anywhere between tokens. The tokens, documented here in detail, are:

  • identifiers (e.g. foo)
  • keywords and reserved words (e.g. lazy)
  • operators and punctuation (e.g. << or ;)
  • numbers (e.g. 30)
  • string literals (e.g. "foo")
  • codepoints
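
To make the token kinds concrete, here is a small sketch with one token of each kind. The variable foo is made up for this example, and the code point is assumed to use the U+41 spelling.

```pomsky
# This comment runs to the end of the line and is ignored
let foo = "foo";          # keyword `let`, identifier, `=`, string literal, `;`

(<< U+41)                 # punctuation, operator `<<`, code point, punctuation
foo{2,30} lazy            # identifier, braces and comma around two numbers, keyword `lazy`
```

Because whitespace and comments are ignored between tokens, the same input could be written on a single line without changing its meaning.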

Note about this grammar

Even though this grammar is written using Pomsky syntax, it isn’t actually accepted by the Pomsky compiler, because it uses cyclic variables, that is, variables that refer to themselves directly or indirectly.
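
A minimal sketch of such a cycle is shown below. Parens is a hypothetical variable, not a rule from this grammar; since it refers to itself, the compiler is expected to reject the input.

```pomsky
# Hypothetical rule: Parens mentions itself in its own definition,
# so this input is expected to fail with an error about recursive variables
let Parens = '(' Parens? ')';
Parens
```

The grammar in this document contains cycles of the same kind, which is why it should be read as a description and not as compilable Pomsky code.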

Expression

let Expression = Statement* Alternation;

See Alternation.
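
As a sketch of this rule, the input below consists of two statements followed by an alternation. Digit and Sign are made-up variable names.

```pomsky
let Digit = ['0'-'9'];        # Statement
let Sign = '+' | '-';         # Statement
Sign? Digit+ | 'NaN'          # Alternation: the part that produces the regex
```

Since Statement* allows zero statements, the last line on its own would also be an Expression.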

Statement

let Statement =
    | LetDeclaration
    | Modifier
    | Test;

See LetDeclaration, Modifier, Test.
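
The following sketch shows one statement of each kind in front of an expression. It assumes the `disable unicode;` modifier and the `test { match ...; }` form; Word is a hypothetical variable. The sections on Modifier and Test remain the authoritative description of those statements.

```pomsky
let Word = [word]+;           # LetDeclaration
disable unicode;              # Modifier (assumed spelling)
test {                        # Test (assumed form)
    match 'hello world';
}
^ Word ' ' Word $
```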

FixExpression

An expression which can have a prefix or suffix.

let FixExpression =
    | Lookaround
    | Negation
    | Repetition;

See Lookaround, Negation, Repetition.
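
For orientation, here is one example of each kind. Each is wrapped in parentheses so that the sketch does not depend on precedence rules, which are defined in the sections referenced above.

```pomsky
(>> 'a')        # Lookaround: the prefix >> makes a lookahead
(![digit])      # Negation: the prefix ! negates the character set
('ab'{2,3})     # Repetition: the suffix {2,3} repeats the string
```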

AtomExpression

let AtomExpression =
    | String
    | CodePoint
    | Group
    | CharacterSet
    | InlineRegex
    | Boundary
    | Reference
    | NumberRange
    | Variable
    | Dot
    | Recursion;

See String, CodePoint, Group, CharacterSet, InlineRegex, Boundary, Reference, NumberRange, Variable, Dot, Recursion.
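
The sketch below lists one example per alternative, assuming the usual Pomsky spelling of each atom; the referenced sections are authoritative. Vowel is a hypothetical variable, declared only so that the Variable line has something to name. Whether the whole input compiles depends on the target regex flavor, because not every flavor supports recursion.

```pomsky
let Vowel = ['aeiou'];    # gives the Variable line below a name to use

^                  # Boundary (start of the string)
:('x')             # Group (capturing, so the Reference below has a target)
'text'             # String
U+41               # CodePoint
[word]             # CharacterSet
regex '\d'         # InlineRegex
::1                # Reference (to capturing group 1)
range '0'-'255'    # NumberRange
Vowel              # Variable
.                  # Dot
recursion          # Recursion
```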