Ranges
Writing a regex matching a number in a certain range can be quite difficult. For example, the following regex matches a number between 0 and 255:
(?:2(?:5[0-5]|[0-4][0-9])|1[0-9][0-9]|[1-9]?[0-9])
This has many downsides:
- It’s not readable
- It’s difficult to come up with
- It’s easy to make a mistake somewhere
- It’s inefficient; a typical regex engine needs to backtrack in several places
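To make these points concrete, the hand-written pattern above can be checked by brute force. The following is a minimal sketch in Python, which is used here only for illustration; the helper name matches_0_to_255 is hypothetical. It relies on re.fullmatch, because the pattern carries no anchors of its own and an unanchored search finds a match inside an out-of-range number such as 256.

```python
import re

# The hand-written pattern from above, copied verbatim.
HANDWRITTEN = re.compile(r"(?:2(?:5[0-5]|[0-4][0-9])|1[0-9][0-9]|[1-9]?[0-9])")


def matches_0_to_255(text: str) -> bool:
    """Return True if the whole string matches the hand-written pattern."""
    return HANDWRITTEN.fullmatch(text) is not None


# Brute-force check: collect every integer below 1000 whose decimal form matches.
accepted = {n for n in range(1000) if matches_0_to_255(str(n))}
print(accepted == set(range(256)))

# Without anchoring, the pattern still finds a match inside "256".
print(HANDWRITTEN.search("256") is not None)
```

A check like this is the only practical way to trust a hand-written range pattern, which is part of why writing one by hand is error-prone.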
Pomsky solves these problems with its range syntax:
range '0'-'255'
Pomsky creates a DFA (deterministic finite automaton) from this, so the generated regex is optimal in terms of matching performance. Since the algorithm for creating this regex is extensively tested, you can also rely on its correctness. Here’s the regex generated from range '0'-'255':
0|1[0-9]{0,2}|2(?:[0-4][0-9]?|5[0-5]?|[6-9])?|[3-9][0-9]?
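As a sanity check, the generated regex can be tested against every candidate number. This is a small Python sketch, not part of Pomsky; the helper name in_range is hypothetical. Python's re module is a backtracking engine, so the check says something about which strings are accepted, not about matching performance.

```python
import re

# Regex reported above as the output for range '0'-'255', copied verbatim.
GENERATED = re.compile(r"0|1[0-9]{0,2}|2(?:[0-4][0-9]?|5[0-5]?|[6-9])?|[3-9][0-9]?")


def in_range(text: str) -> bool:
    """Return True if the whole string matches the generated regex."""
    return GENERATED.fullmatch(text) is not None


# Every integer below 10000 whose decimal form is accepted.
accepted = {n for n in range(10_000) if in_range(str(n))}
print(accepted == set(range(256)))

# Leading zeroes are not accepted by this regex.
print(in_range("007"))
```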
Different bases
Pomsky can generate ranges in various bases. For example, to match hexadecimal numbers in a certain range, you might write:
range '10F'-'FFFF' base 16
This generates the following regex:
1(?:0(?:[0-9a-eA-E][0-9a-fA-F]|[fF][0-9a-fA-F]?)|[1-9a-fA-F][0-9a-fA-F]{1,2})|[2-9a-fA-F][0-9a-fA-F]{2,3}
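The hexadecimal regex is harder to verify by eye, so an exhaustive check is useful. The sketch below, in Python with the hypothetical helper in_hex_range, formats every integer up to 0x1FFFF in lower and upper case and compares the regex's verdict with a plain numeric comparison. The bounds LOW and HIGH are taken from the range expression above.

```python
import re

# Regex reported above for range '10F'-'FFFF' base 16, split over two
# string literals only for line length; the concatenation is verbatim.
HEX_RANGE = re.compile(
    r"1(?:0(?:[0-9a-eA-E][0-9a-fA-F]|[fF][0-9a-fA-F]?)|[1-9a-fA-F][0-9a-fA-F]{1,2})"
    r"|[2-9a-fA-F][0-9a-fA-F]{2,3}"
)

LOW, HIGH = 0x10F, 0xFFFF


def in_hex_range(text: str) -> bool:
    """Return True if the whole string matches the hexadecimal range regex."""
    return HEX_RANGE.fullmatch(text) is not None


# Collect every number where the regex disagrees with the numeric comparison.
mismatches = [
    (n, digits)
    for n in range(0x20000)
    for digits in (format(n, "x"), format(n, "X"))
    if in_hex_range(digits) != (LOW <= n <= HIGH)
]
print(len(mismatches))
```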
Leading zeroes
If you wish to support leading zeroes, this is easy to achieve by putting '0'* in front of the range:
'0'* range '0'-'1024'
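The effect of the '0'* prefix can be illustrated in Python. The regex for range '0'-'1024' is not shown on this page, so this sketch reuses the regex given earlier for range '0'-'255'. It also assumes that '0'* corresponds to 0* and that the range alternation is wrapped in a non-capturing group; the exact output of Pomsky may differ.

```python
import re

# Regex given earlier on this page for range '0'-'255'.
RANGE_0_255 = r"0|1[0-9]{0,2}|2(?:[0-4][0-9]?|5[0-5]?|[6-9])?|[3-9][0-9]?"

# Assumption: '0'* becomes 0*, and the alternation is grouped so that the
# prefix applies to every branch.
WITH_LEADING_ZEROES = re.compile(rf"0*(?:{RANGE_0_255})")


def accepts(text: str) -> bool:
    """Return True if the whole string is a number in 0..255, zeroes allowed in front."""
    return WITH_LEADING_ZEROES.fullmatch(text) is not None


for sample in ["7", "007", "0255", "000", "0256"]:
    print(sample, accepts(sample))
```

Any number of leading zeroes is accepted, but the digits after them still have to form a number inside the range.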
If the number should have a certain length, with leading zeroes added when necessary, Pomsky has a special syntax for this:
range '0000'-'1024'
This matches numbers in the specified range with exactly 4 digits, such as 0110 or 0026.
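The fixed-width behaviour described here can be modelled in a few lines of Python. In this sketch, FIXED_WIDTH is a hand-written regex with the same meaning, not the regex Pomsky emits, and reference is a hypothetical helper that states the rule directly: exactly four ASCII digits whose value is at most 1024.

```python
import re

# Hand-written equivalent for illustration only; NOT Pomsky's actual output.
FIXED_WIDTH = re.compile(r"0[0-9]{3}|10(?:[01][0-9]|2[0-4])")


def reference(text: str) -> bool:
    """Exactly four ASCII digits whose numeric value lies in 0..1024."""
    return re.fullmatch(r"[0-9]{4}", text) is not None and int(text) <= 1024


# All zero-padded digit strings of width 1 to 5.
candidates = [str(n).zfill(width) for width in range(1, 6) for n in range(10 ** width)]

# Strings where the regex and the reference rule disagree.
disagreements = [
    s for s in candidates
    if (FIXED_WIDTH.fullmatch(s) is not None) != reference(s)
]
print(len(disagreements))
```

Note that 110 and 26 without padding are rejected under this rule, while 0110 and 0026 are accepted.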