Unicode properties
Pomsky supports the following kinds of Unicode properties:
- General categories
- Scripts
- Blocks
- Other boolean properties
However, not all regex engines support all of them. In particular, blocks and other properties are poorly supported.
Note that property names are written with underscores: spaces and hyphens in the official Unicode names must be replaced with underscores.
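The underscore rule can be sketched as a small helper. The snippet below is only an illustration in Python and is not part of Pomsky; the function name `to_property_name` and its `block` flag are hypothetical. It replaces spaces and hyphens in an official Unicode name with underscores and, for blocks, adds the `In` prefix used in the block list on this page.

```python
import re

def to_property_name(unicode_name: str, block: bool = False) -> str:
    """Turn an official Unicode name into the underscore form used on this page."""
    # Collapse each run of spaces and/or hyphens into a single underscore.
    name = re.sub(r"[ -]+", "_", unicode_name.strip())
    # Block names in the list on this page carry an "In" prefix.
    return "In" + name if block else name

print(to_property_name("Old South Arabian"))
print(to_property_name("Latin-1 Supplement", block=True))
```

With these inputs the helper yields `Old_South_Arabian` and `InLatin_1_Supplement`, both of which appear in the lists below.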
List of General Categories
Each line contains one category and its aliases. The words in each line can be used interchangeably.
Cased_Letter, LC
Close_Punctuation, Pe
Connector_Punctuation, Pc
Control, Cc, cntrl
Currency_Symbol, Sc
Dash_Punctuation, Pd
Decimal_Number, Nd, digit, d
Enclosing_Mark, Me
Final_Punctuation, Pf
Format, Cf
Initial_Punctuation, Pi
Letter, L
Letter_Number, Nl
Line_Separator, Zl
Lowercase_Letter, Ll
Mark, M, Combining_Mark
Math_Symbol, Sm
Modifier_Letter, Lm
Modifier_Symbol, Sk
Nonspacing_Mark, Mn
Number, N
Open_Punctuation, Ps
Other, C
Other_Letter, Lo
Other_Number, No
Other_Punctuation, Po
Other_Symbol, So
Paragraph_Separator, Zp
Private_Use, Co
Punctuation, P, punct
Separator, Z, space, s
Space_Separator, Zs
Spacing_Mark, Mc
Surrogate, Cs
Symbol, S
Titlecase_Letter, Lt
Unassigned, Cn
Uppercase_Letter, Lu
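To see how the long names, two-letter aliases, and one-letter groups relate, it can help to look up the category of a few characters. The following is a sketch that uses Python's standard `unicodedata` module rather than Pomsky; the `LONG_NAMES` table is a hand-picked excerpt of the list above and `describe` is a hypothetical helper. `unicodedata.category` returns the two-letter alias, and its first letter is the alias of the enclosing group (for example, `Lu` belongs to `L`, i.e. `Letter`).

```python
import unicodedata

# Excerpt of the list above: alias -> long name (hypothetical helper table).
LONG_NAMES = {
    "Lu": "Uppercase_Letter",
    "Ll": "Lowercase_Letter",
    "Nd": "Decimal_Number",
    "Sc": "Currency_Symbol",
    "Zs": "Space_Separator",
    "L": "Letter",
    "N": "Number",
    "S": "Symbol",
    "Z": "Separator",
}

def describe(ch: str) -> tuple[str, str]:
    """Return (specific category, enclosing group) for one character."""
    alias = unicodedata.category(ch)  # two-letter alias, e.g. "Lu"
    return LONG_NAMES[alias], LONG_NAMES[alias[0]]

for ch in "Aß5€ ":
    print(repr(ch), *describe(ch))
```

`Cased_Letter` (`LC`) does not follow the first-letter pattern: in Unicode it is the union of `Lu`, `Ll`, and `Lt`.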
List of Scripts
Each line contains one script and its abbreviation, if it has one. The words in each line can be used interchangeably.
Adlam, Adlm
Ahom
Anatolian_Hieroglyphs, Hluw
Arabic, Arab
Armenian, Armn
Avestan, Avst
Balinese, Bali
Bamum, Bamu
Bassa_Vah, Bass
Batak, Batk
Bengali, Beng
Bhaiksuki, Bhks
Bopomofo, Bopo
Brahmi, Brah
Braille, Brai
Buginese, Bugi
Buhid, Buhd
Canadian_Aboriginal, Cans
Carian, Cari
Caucasian_Albanian, Aghb
Chakma, Cakm
Cham
Chorasmian, Chrs
Cherokee, Cher
Common, Zyyy
Coptic, Copt
Cuneiform, Xsux
Cypriot, Cprt
Cypro_Minoan, Cpmn
Cyrillic, Cyrl
Deseret, Dsrt
Devanagari, Deva
Dives_Akuru, Diak
Dogra, Dogr
Duployan, Dupl
Egyptian_Hieroglyphs, Egyp
Elbasan, Elba
Elymaic, Elym
Ethiopic, Ethi
Georgian, Geor
Glagolitic, Glag
Gothic, Goth
Grantha, Gran
Greek, Grek
Gujarati, Gujr
Gunjala_Gondi, Gong
Gurmukhi, Guru
Han, Hani
Hangul, Hang
Hanifi_Rohingya, Rohg
Hanunoo, Hano
Hatran, Hatr
Hebrew, Hebr
Hiragana, Hira
Imperial_Aramaic, Armi
Inherited, Zinh
Inscriptional_Pahlavi, Phli
Inscriptional_Parthian, Prti
Javanese, Java
Kaithi, Kthi
Kannada, Knda
Katakana, Kana
Kayah_Li, Kali
Kharoshthi, Khar
Khitan_Small_Script, Kits
Khmer, Khmr
Khojki, Khoj
Khudawadi, Sind
Lao, Laoo
Latin, Latn
Lepcha, Lepc
Limbu, Limb
Linear_A, Lina
Linear_B, Linb
Lisu
Lycian, Lyci
Lydian, Lydi
Mahajani, Mahj
Makasar, Maka
Malayalam, Mlym
Mandaic, Mand
Manichaean, Mani
Marchen, Marc
Medefaidrin, Medf
Masaram_Gondi, Gonm
Meetei_Mayek, Mtei
Mende_Kikakui, Mend
Meroitic_Cursive, Merc
Meroitic_Hieroglyphs, Mero
Miao, Plrd
Modi
Mongolian, Mong
Mro, Mroo
Multani, Mult
Myanmar, Mymr
Nabataean, Nbat
Nandinagari, Nand
New_Tai_Lue, Talu
Newa
Nko, Nkoo
Nushu, Nshu
Nyiakeng_Puachue_Hmong, Hmnp
Ogham, Ogam
Ol_Chiki, Olck
Old_Hungarian, Hung
Old_Italic, Ital
Old_North_Arabian, Narb
Old_Permic, Perm
Old_Persian, Xpeo
Old_Sogdian, Sogo
Old_South_Arabian, Sarb
Old_Turkic, Orkh
Old_Uyghur, Ougr
Oriya, Orya
Osage, Osge
Osmanya, Osma
Pahawh_Hmong, Hmng
Palmyrene, Palm
Pau_Cin_Hau, Pauc
Phags_Pa, Phag
Phoenician, Phnx
Psalter_Pahlavi, Phlp
Rejang, Rjng
Runic, Runr
Samaritan, Samr
Saurashtra, Saur
Sharada, Shrd
Shavian, Shaw
Siddham, Sidd
SignWriting, Sgnw
Sinhala, Sinh
Sogdian, Sogd
Sora_Sompeng, Sora
Soyombo, Soyo
Sundanese, Sund
Syloti_Nagri, Sylo
Syriac, Syrc
Tagalog, Tglg
Tagbanwa, Tagb
Tai_Le, Tale
Tai_Tham, Lana
Tai_Viet, Tavt
Takri, Takr
Tamil, Taml
Tangsa, Tnsa
Tangut, Tang
Telugu, Telu
Thaana, Thaa
Thai
Tibetan, Tibt
Tifinagh, Tfng
Tirhuta, Tirh
Toto
Ugaritic, Ugar
Vai, Vaii
Vithkuqi, Vith
Wancho, Wcho
Warang_Citi, Wara
Yezidi, Yezi
Yi, Yiii
Zanabazar_Square, Zanb
List of Blocks
InBasic_Latin
InLatin_1_Supplement
InLatin_Extended_A
InLatin_Extended_B
InIPA_Extensions
InSpacing_Modifier_Letters
InCombining_Diacritical_Marks
InGreek_and_Coptic
InCyrillic
InCyrillic_Supplementary
InArmenian
InHebrew
InArabic
InSyriac
InThaana
InDevanagari
InBengali
InGurmukhi
InGujarati
InOriya
InTamil
InTelugu
InKannada
InMalayalam
InSinhala
InThai
InLao
InTibetan
InMyanmar
InGeorgian
InHangul_Jamo
InEthiopic
InCherokee
InUnified_Canadian_Aboriginal_Syllabics
InOgham
InRunic
InTagalog
InHanunoo
InBuhid
InTagbanwa
InKhmer
InMongolian
InLimbu
InTai_Le
InKhmer_Symbols
InPhonetic_Extensions
InLatin_Extended_Additional
InGreek_Extended
InGeneral_Punctuation
InSuperscripts_and_Subscripts
InCurrency_Symbols
InCombining_Diacritical_Marks_for_Symbols
InLetterlike_Symbols
InNumber_Forms
InArrows
InMathematical_Operators
InMiscellaneous_Technical
InControl_Pictures
InOptical_Character_Recognition
InEnclosed_Alphanumerics
InBox_Drawing
InBlock_Elements
InGeometric_Shapes
InMiscellaneous_Symbols
InDingbats
InMiscellaneous_Mathematical_Symbols_A
InSupplemental_Arrows_A
InBraille_Patterns
InSupplemental_Arrows_B
InMiscellaneous_Mathematical_Symbols_B
InSupplemental_Mathematical_Operators
InMiscellaneous_Symbols_and_Arrows
InCJK_Radicals_Supplement
InKangxi_Radicals
InIdeographic_Description_Characters
InCJK_Symbols_and_Punctuation
InHiragana
InKatakana
InBopomofo
InHangul_Compatibility_Jamo
InKanbun
InBopomofo_Extended
InKatakana_Phonetic_Extensions
InEnclosed_CJK_Letters_and_Months
InCJK_Compatibility
InCJK_Unified_Ideographs_Extension_A
InYijing_Hexagram_Symbols
InCJK_Unified_Ideographs
InYi_Syllables
InYi_Radicals
InHangul_Syllables
InHigh_Surrogates
InHigh_Private_Use_Surrogates
InLow_Surrogates
InPrivate_Use_Area
InCJK_Compatibility_Ideographs
InAlphabetic_Presentation_Forms
InArabic_Presentation_Forms_A
InVariation_Selectors
InCombining_Half_Marks
InCJK_Compatibility_Forms
InSmall_Form_Variants
InArabic_Presentation_Forms_B
InHalfwidth_and_Fullwidth_Forms
InSpecials
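Because a block is a contiguous range of code points, it can be written as a plain character range when the target regex engine has no block support. The sketch below, in Python, hard-codes the ranges of a few blocks from the list above as published in the Unicode `Blocks.txt` data file; the `BLOCK_RANGES` table and both helper functions are hypothetical and not part of Pomsky.

```python
import re

# A few block ranges (inclusive), taken from Unicode's Blocks.txt.
BLOCK_RANGES = {
    "InBasic_Latin": (0x0000, 0x007F),
    "InLatin_1_Supplement": (0x0080, 0x00FF),
    "InGreek_and_Coptic": (0x0370, 0x03FF),
    "InCyrillic": (0x0400, 0x04FF),
    "InHiragana": (0x3040, 0x309F),
}

def in_block(ch: str, block: str) -> bool:
    """Check whether a single character lies inside the given block."""
    lo, hi = BLOCK_RANGES[block]
    return lo <= ord(ch) <= hi

def as_char_class(block: str) -> str:
    """Spell the block as a regex character class using \\uXXXX escapes."""
    lo, hi = BLOCK_RANGES[block]
    return f"[\\u{lo:04X}-\\u{hi:04X}]"

print(in_block("é", "InLatin_1_Supplement"))
print(as_char_class("InCyrillic"))
print(bool(re.fullmatch(as_char_class("InCyrillic"), "Ж")))
```

The four-digit escape form only works for blocks in the Basic Multilingual Plane, which covers every block in this excerpt. A block is also not the same thing as a script: a block can contain unassigned code points or characters from several scripts.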
List of Other Supported Properties
White_Space
Alphabetic, Alpha
Noncharacter_Code_Point
Default_Ignorable_Code_Point
Logical_Order_Exception
Deprecated
Variation_Selector
Uppercase, upper
Lowercase, lower
Soft_Dotted
Case_Ignorable
Changes_When_Lowercased
Changes_When_Uppercased
Changes_When_Titlecased
Changes_When_Casefolded
Changes_When_Casemapped
Emoji
Emoji_Presentation
Emoji_Modifier
Emoji_Modifier_Base
Emoji_Component
Extended_Pictographic
Hex_Digit
ASCII_Hex_Digit
Join_Control
Joining_Group
Bidi_Control
Bidi_Mirrored
Bidi_Mirroring_Glyph
ID_Continue
ID_Start
XID_Continue
XID_Start
Pattern_Syntax
Pattern_White_Space
Ideographic
Unified_Ideograph
Radical
IDS_Binary_Operator
IDS_Trinary_Operator
Math
Quotation_Mark
Dash
Sentence_Terminal
Terminal_Punctuation
Diacritic
Extender
Grapheme_Base
Grapheme_Extend
Regional_Indicator