Formal grammar

This document uses Pomsky syntax to describe Pomsky’s syntax. Here’s an incomplete summary, which is enough to read the grammar:

  • Variables are declared as let var_name = expression;. This means that var_name can be parsed by parsing expression.

  • Verbatim text is wrapped in double quotes ("") or single quotes ('').

  • A * after a rule indicates that it repeats 0 or more times.

  • A + after a rule indicates that it repeats 1 or more times.

  • A ? after a rule indicates that the rule is optional.

  • Rules can be grouped together by wrapping them in parentheses (()).

  • Alternative rules are each preceded by a vertical bar (|).
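As a quick illustration of the notation summarized above, here is a small sketch. The rule names Sign, Digit, Integer and List are hypothetical and are not part of Pomsky's grammar; they only show how declarations, quoted text, repetition, optional rules, grouping and alternatives combine.

```pomsky
let Sign =
| '+'
| '-';

let Digit = '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9';

let Integer = Sign? Digit+;

let List = '[' (Integer (',' Integer)*)? ']';
```

Read this way, Integer is an optional Sign followed by one or more Digit rules, and List is a bracketed, comma-separated sequence of Integer rules that may also be empty.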

Comments start with # and end at the end of the same line. Comments and whitespace are ignored; they can be added anywhere between tokens. Tokens are

  • identifiers (e.g. foo)
  • keywords and reserved words (e.g. lazy)
  • operators and punctuation (e.g. << or ;)
  • numbers (e.g. 30)
  • string literals (e.g. "foo")
  • codepoints

as documented here in detail.
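The following sketch shows comments and whitespace placed between tokens. The variable name digit is an assumption chosen for illustration; the extra spaces and line breaks should make no difference to how the input is parsed.

```pomsky
# A comment on its own line
let   digit   =   ['0'-'9']  ;   # a trailing comment

digit{2}   # identifier "digit", punctuation "{", number "2", punctuation "}"
```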

Even though this grammar is written using Pomsky syntax, it isn’t actually accepted by the Pomsky compiler, because it uses cyclic variables.
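For example, a rule like the hypothetical one below is fine as grammar notation, but a variable that refers to itself is cyclic, so the compiler would be expected to reject it. The name Parens is an assumption made up for this sketch.

```pomsky
# Hypothetical rule, for illustration only.
# Parens refers to itself, so the declaration is cyclic.
let Parens = '(' Parens? ')';
```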

let Expression = Statement* Alternation;

See Alternation.
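A minimal sketch of an input that has this shape: two statements (a modifier and a let declaration) followed by an alternation. The variable name digit is an assumption, not something the grammar prescribes.

```pomsky
# Statements come first ...
enable lazy;
let digit = ['0'-'9'];

# ... followed by the alternation that forms the body of the expression
digit+ '.' digit+ | 'none'
```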

let Statement =
| LetDeclaration
| Modifier
| Test;

See LetDeclaration, Modifier, Test.

An expression which can have a prefix or suffix.

let FixExpression =
| Lookaround
| Negation
| Repetition;

See Lookaround, Negation, Repetition.
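The fragments below sketch what a prefix or suffix looks like in practice. Each line is a separate fragment, and the labels reflect one reading of the rule names above; the sections linked above remain the authoritative definitions.

```pomsky
>> 'a'         # Lookaround: the >> prefix
!['a'-'z']     # Negation: the ! prefix, here applied to a character set
'a'{2,5}       # Repetition: the {2,5} suffix
'a'+ lazy      # Repetition: the + suffix, followed by the lazy keyword
```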

let AtomExpression =
| String
| CodePoint
| Group
| CharacterSet
| InlineRegex
| Boundary
| Reference
| NumberRange
| Variable
| Dot
| Recursion;

See String, CodePoint, Group, CharacterSet, InlineRegex, Boundary, Reference, NumberRange, Variable, Dot, Recursion.
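To give a rough feel for these alternatives, here is one fragment per rule. Each line is meant to be read on its own rather than as one program, the name my_variable is hypothetical, and the exact syntax of each form is defined by the rules referenced above.

```pomsky
'text'             # String
U+2764             # CodePoint
('a' 'b')          # Group
['a'-'z' '_']      # CharacterSet
regex '[a-z]'      # InlineRegex
%                  # Boundary (a word boundary)
::1                # Reference (assumes a first capturing group exists)
range '0'-'255'    # NumberRange
my_variable        # Variable (assumes a matching let declaration)
.                  # Dot
recursion          # Recursion
```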