Category | Datatype | Source | Property | Values |
---|---|---|---|---|
Bidirectional | Binary | UCD | Bidi_Control | No (N), Yes (Y) |
Bidirectional | Binary | UCD | Bidi_Mirrored | No (N), Yes (Y) |
Bidirectional | Enumerated | UCD | Bidi_Class | Show Values |
Bidirectional | Enumerated | UCD | Bidi_Paired_Bracket_Type | Close (C), None (N), Open (O) |
Bidirectional | String | UCD | Bidi_Mirroring_Glyph | Show Values |
Bidirectional | String | UCD | Bidi_Paired_Bracket | Show Values |
Case | Binary | ICU | Case_Sensitive | No (N), Yes (Y) |
Case | Binary | UCD | Case_Ignorable | No (N), Yes (Y) |
Case | Binary | UCD | Cased | No (N), Yes (Y) |
Case | Binary | UCD | Changes_When_Casefolded | No (N), Yes (Y) |
Case | Binary | UCD | Changes_When_Casemapped | No (N), Yes (Y) |
Case | Binary | UCD | Changes_When_Lowercased | No (N), Yes (Y) |
Case | Binary | UCD | Changes_When_Titlecased | No (N), Yes (Y) |
Case | Binary | UCD | Changes_When_Uppercased | No (N), Yes (Y) |
Case | Binary | UCD | Lowercase | No (N), Yes (Y) |
Case | Binary | UCD | Soft_Dotted | No (N), Yes (Y) |
Case | Binary | UCD | Uppercase | No (N), Yes (Y) |
Case | Binary | Unicode | isCased | No (N), Yes (Y) |
Case | Binary | Unicode | isCasefolded | No (N), Yes (Y) |
Case | Binary | Unicode | isLowercase | No (N), Yes (Y) |
Case | Binary | Unicode | isTitlecase | No (N), Yes (Y) |
Case | Binary | Unicode | isUppercase | No (N), Yes (Y) |
Case | String | UCD | Case_Folding | Show Values |
Case | String | UCD | Lowercase_Mapping | Show Values |
Case | String | UCD | Simple_Case_Folding | Show Values |
Case | String | UCD | Simple_Lowercase_Mapping | Show Values |
Case | String | UCD | Simple_Titlecase_Mapping | Show Values |
Case | String | UCD | Simple_Uppercase_Mapping | Show Values |
Case | String | UCD | Titlecase_Mapping | Show Values |
Case | String | UCD | Uppercase_Mapping | Show Values |
Case | String | Unicode | toCasefold | Show Values |
Case | String | Unicode | toLowercase | Show Values |
Case | String | Unicode | toTitlecase | Show Values |
Case | String | Unicode | toUppercase | Show Values |
CJK | Binary | UCD | IDS_Binary_Operator | No (N), Yes (Y) |
CJK | Binary | UCD | IDS_Trinary_Operator | No (N), Yes (Y) |
CJK | Binary | UCD | Ideographic | No (N), Yes (Y) |
CJK | Binary | UCD | Radical | No (N), Yes (Y) |
CJK | Binary | UCD | Unified_Ideograph | No (N), Yes (Y) |
CJK | Enumerated | X-Demo | HanType | Han, Hans, Hant, na |
CJK | String | UCD | CJK_Radical | Show Values |
kSimplifiedVariant | 万 (万), 与 (与), 丑 (丑), 专 (专), 业 (业), 丛 (丛), 东 (东), 丝 (丝), 丢 (丢), 两 (两), 严 (严), 丽 (丽), 丧 (丧), 个 (个), 丰 (丰), 临 (临), 义 (义), 为 (为), 举 (举), 么 (么), 乌 (乌), 乐 (乐), 乔 (乔), 习 (习), 书 (书), 买 (买), 乱 (乱), 争 (争), 于 (于), 亏 (亏), 云 (云), 亚 (亚), 产 (产), 亩 (亩), 亲 (亲), 亵 (亵), 亸 (亸), 亿 (亿), 仅 (仅), 仆 (仆), 从 (从), 仑 (仑), 仓 (仓), 仪 (仪), 们 (们), 价 (价), 众 (众), 优 (优), 会 (会), 伛 (伛), 伞 (伞), 伟 (伟), 传 (传), 伡 (伡), 伣 (伣), 伤 (伤), 伥 (伥), 伦 (伦), 伧 (伧), 伪 (伪), 伫 (伫), 㐽 (㐽), 𠆿 (𠆿), 体 (体), 余 (余), 余馀 (余馀), 佣 (佣), 佥 (佥), 㑇 (㑇), 㑈 (㑈), 侠 (侠), 侣 (侣), 侥 (侥), 侦 (侦), 侧 (侧), 侨 (侨), 侩 (侩), 侪 (侪), 侬 (侬), 㑔 (㑔), 俣 (俣), 俦 (俦), 俨 (俨), 俩 (俩), 俪 (俪), 俫 (俫), 俭 (俭), 𠉂 (𠉂), 𠉗 (𠉗), 债 (债), 倾 (倾), 㑩 (㑩), 偬 (偬), 偻 (偻), 偾 (偾), 偿 (偿), 傥 (傥), 傧 (傧), 储 (储), 傩 (傩), 儿 (儿), 克 (克), 兑 (兑), 兖 (兖), 党 (党), 兰 (兰), 关 (关), 兴 (兴), 兹 (兹), 养 (养), 兽 (兽), 冁 (冁), 内 (内), 冈 (冈), 册 (册), 写 (写), 军 (军), 农 (农), 冯 (冯), 冲 (冲), 决 (决), 况 (况), 冻 (冻), 𪞝 (𪞝), 净 (净), 准 (准), 凉 (凉), 减 (减), 凑 (凑), 凛 (凛), 几 (几), 凤 (凤), 凫 (凫), 凭 (凭), 凯 (凯), 击 (击), 凿 (凿), 刍 (刍), 𠚳 (𠚳), 划 (划), 刘 (刘), 则 (则), 刚 (刚), 创 (创), 𠛅 (𠛅), 𠛆 (𠛆), 删 (删), 别 (别), 刬 (刬), 刭 (刭), 刮 (刮), 制 (制), 刹 (刹), 刽 (刽), 刾 (刾), 刿 (刿), 剀 (剀), 剂 (剂), 㓥 (㓥), 剐 (剐), 剑 (剑), 剥 (剥), 剧 (剧), 㔉 (㔉), 劝 (劝), 办 (办), 务 (务), 劢 (劢), 动 (动), 励 (励), 劲 (劲), 劳 (劳), 势 (势), 勋 (勋), 勚 (勚), 匀 (匀), 匦 (匦), 匮 (匮), 区 (区), 医 (医), 华 (华), 协 (协), 单 (单), 卖 (卖), 卢 (卢), 卤 (卤), 卫 (卫), 却 (却), 厂 (厂), 厅 (厅), 历 (历), 厉 (厉), 压 (压), 厌 (厌), 厍 (厍), 厐 (厐), 厕 (厕), 厘 (厘), 厢 (厢), 厣 (厣), 厩 (厩), 厦 (厦), 厨 (厨), 厮 (厮), 县 (县), 叁 (叁), 参 (参), 双 (双), 发 (发), 变 (变), 叙 (叙), 叠 (叠), 只 (只), 台 (台), 叶 (叶), 号 (号), 叹 (叹), 叽 (叽), 同 (同), 后 (后), 向 (向), 吓 (吓), 吕 (吕), 吗 (吗), 吣 (吣), 吨 (吨), 听 (听), 启 (启), 吴 (吴), 呐 (呐), 呒 (呒), 呓 (呓), 呕 (呕), 呖 (呖), 呗 (呗), 员 (员), 呙 (呙), 呛 (呛), 呜 (呜), 𠯟 (𠯟), 𠯠 (𠯠), 咏 (咏), 咙 (咙), 咛 (咛), 咝 (咝), 咤 (咤), 咸 (咸), 响 (响), 哑 (哑), 哒 (哒), 哓 (哓), 哔 (哔), 哕 (哕), 哗 (哗), 哙 (哙) too many values to show | |||
CJK | String | UCD | kTraditionalVariant | Show Values |
Emoji | Binary | UTS | Emoji | No (N), Yes (Y) |
Emoji | Binary | UTS | Emoji_Component | No (N), Yes (Y) |
Emoji | Binary | UTS | Emoji_Flag_Sequence | No (No), Yes (Yes) |
Emoji | Binary | UTS | Emoji_Keycap_Sequence | No (No), Yes (Yes) |
Emoji | Binary | UTS | Emoji_Modifier | No (N), Yes (Y) |
Emoji | Binary | UTS | Emoji_Modifier_Base | No (N), Yes (Y) |
Emoji | Binary | UTS | Emoji_Modifier_Sequence | No (No), Yes (Yes) |
Emoji | Binary | UTS | Emoji_Presentation | No (N), Yes (Y) |
Emoji | Binary | UTS | Emoji_Tag_Sequence | No (No), Yes (Yes) |
Emoji | Binary | UTS | Emoji_Zwj_Sequence | No (No), Yes (Yes) |
Emoji | Enumerated | UCD | Regional_Indicator | No (N), Yes (Y) |
General | Binary | UCD | Alphabetic | No (N), Yes (Y) |
General | Binary | UCD | Default_Ignorable_Code_Point | No (N), Yes (Y) |
General | Binary | UCD | Deprecated | No (N), Yes (Y) |
General | Binary | UCD | Logical_Order_Exception | No (N), Yes (Y) |
General | Binary | UCD | Noncharacter_Code_Point | No (N), Yes (Y) |
General | Binary | UCD | Variation_Selector | No (N), Yes (Y) |
General | Binary | UCD | White_Space | No (N), Yes (Y) |
General | Catalog | UCD | Age | Show Values |
General | Catalog | UCD | Block | Show Values |
General | Catalog | UCD | Script | Show Values |
General | Enumerated | UCD | General_Category | Show Values |
General | Enumerated | UCD | Hangul_Syllable_Type | Leading_Jamo (L), LV_Syllable (LV), LVT_Syllable (LVT), Not_Applicable (NA), Trailing_Jamo (T), Vowel_Jamo (V) |
General | Enumerated | UCD | Name_Alias | Show Values |
General | Enumerated | UCD | Named_Sequences | Show Values |
General | Enumerated | UCD | Named_Sequences_Prov | |
General | String | Nameslist | subhead | Show Values |
General | String | UCD | Name | Show Values |
General | String | UCD | Script_Extensions | Show Values |
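Identifiers | Binary | UCD | ID_Continue | No (N), Yes (Y) |
Identifiers | Binary | UCD | ID_Start | No (N), Yes (Y) |
Identifiers | Binary | UCD | Pattern_Syntax | No (N), Yes (Y) |
Identifiers | Binary | UCD | Pattern_White_Space | No (N), Yes (Y) |
Identifiers | Binary | UCD | XID_Continue | No (N), Yes (Y) |
Identifiers | Binary | UCD | XID_Start | No (N), Yes (Y) |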
IDNA | Enumerated | UTS | Idn_2008 | na (na), NV8 (nv8), XV8 (xv8) |
IDNA | Enumerated | UTS | Idn_Status | deviation (dv), disallowed (da), disallowed_STD3_mapped (ds3m), disallowed_STD3_valid (ds3v), ignored (i), mapped (m), valid (v) |
IDNA | Enumerated | UTS | idna2003 | deviation, disallowed, ignored, mapped, valid |
IDNA | Enumerated | UTS | idna2008 | CONTEXTJ, CONTEXTO, DISALLOWED, PVALID, UNASSIGNED |
IDNA | Enumerated | UTS | idna2008c | deviation, disallowed, ignored, mapped, valid |
IDNA | Enumerated | UTS | uts46 | deviation, disallowed, ignored, mapped, valid |
IDNA | String | UTS | Idn_Mapping | Show Values |
IDNA | String | UTS | toIdna2003 | Show Values |
IDNA | String | UTS | toUts46n | Show Values |
IDNA | String | UTS | toUts46t | Show Values |
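Miscellaneous | Binary | UCD | Dash | No (N), Yes (Y) |
Miscellaneous | Binary | UCD | Diacritic | No (N), Yes (Y) |
Miscellaneous | Binary | UCD | Extender | No (N), Yes (Y) |
Miscellaneous | Binary | UCD | Grapheme_Base | No (N), Yes (Y) |
Miscellaneous | Binary | UCD | Grapheme_Extend | No (N), Yes (Y) |
Miscellaneous | Binary | UCD | Grapheme_Link | No (N), Yes (Y) |
Miscellaneous | Binary | UCD | Hyphen | No (N), Yes (Y) |
Miscellaneous | Binary | UCD | Math | No (N), Yes (Y) |
Miscellaneous | Binary | UCD | Quotation_Mark | No (N), Yes (Y) |
Miscellaneous | Binary | UCD | Sentence_Terminal | No (N), Yes (Y) |
Miscellaneous | Binary | UCD | Terminal_Punctuation | No (N), Yes (Y) |
Miscellaneous | Enumerated | UCD | Indic_Positional_Category | Show Values |
Miscellaneous | Enumerated | UCD | Indic_Syllabic_Category | Show Values |
Miscellaneous | Miscellaneous | UCD | ISO_Comment | Show Values |
Miscellaneous | Miscellaneous | UCD | Unicode_1_Name | Show Values |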
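Normalization | Binary | ICU | NFC_Inert | No (N), Yes (Y) |
Normalization | Binary | ICU | NFD_Inert | No (N), Yes (Y) |
Normalization | Binary | ICU | NFKC_Inert | No (N), Yes (Y) |
Normalization | Binary | ICU | NFKD_Inert | No (N), Yes (Y) |
Normalization | Binary | ICU | isNFM | No, Yes |
Normalization | Binary | UCD | Changes_When_NFKC_Casefolded | No (N), Yes (Y) |
Normalization | Binary | UCD | Full_Composition_Exclusion | No (N), Yes (Y) |
Normalization | Binary | Unicode | isNFC | No, Yes |
Normalization | Binary | Unicode | isNFD | No, Yes |
Normalization | Binary | Unicode | isNFKC | No, Yes |
Normalization | Binary | Unicode | isNFKD | No, Yes |
Normalization | Enumerated | ICU | Lead_Canonical_Combining_Class | Show Values |
Normalization | Enumerated | ICU | Trail_Canonical_Combining_Class | Show Values |
Normalization | Enumerated | UCD | Canonical_Combining_Class | Show Values |
Normalization | Enumerated | UCD | Decomposition_Type | Show Values |
Normalization | Enumerated | UCD | NFC_Quick_Check | Maybe (M), No (N), Yes (Y) |
Normalization | Enumerated | UCD | NFD_Quick_Check | No (N), Yes (Y) |
Normalization | Enumerated | UCD | NFKC_Quick_Check | Maybe (M), No (N), Yes (Y) |
Normalization | Enumerated | UCD | NFKD_Quick_Check | No (N), Yes (Y) |
Normalization | String | ICU | toNFM | Show Values |
Normalization | String | UCD | NFKC_Casefold | Show Values |
Normalization | String | Unicode | toNFC | Show Values |
Normalization | String | Unicode | toNFD | Show Values |
Normalization | String | Unicode | toNFKC | Show Values |
Normalization | String | Unicode | toNFKD | Show Values |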
Numeric | Binary | UCD | ASCII_Hex_Digit | No (N), Yes (Y) |
Numeric | Binary | UCD | Hex_Digit | No (N), Yes (Y) |
Numeric | Enumerated | UCD | Numeric_Type | Decimal (De), Digit (Di), None (None), Numeric (Nu) |
Numeric | Enumerated | UCD | kAccountingNumeric | Show Values |
Numeric | Enumerated | UCD | kOtherNumeric | Show Values |
Numeric | Enumerated | UCD | kPrimaryNumeric | Show Values |
Numeric | Numeric | UCD | Numeric_Value | Show Values |
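Regex | Binary | UTS | ANY | No, Yes |
Regex | Binary | UTS | ASCII | No, Yes |
Regex | Binary | UTS | alnum | No (N), Yes (Y) |
Regex | Binary | UTS | blank | No (N), Yes (Y) |
Regex | Binary | UTS | bmp | No, Yes |
Regex | Binary | UTS | graph | No (N), Yes (Y) |
Regex | Binary | UTS | | No (N), Yes (Y) |
Regex | Binary | UTS | xdigit | No (N), Yes (Y) |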
Security | Enumerated | UTS | Confusable_MA | Show Values |
Security | Enumerated | UTS | Identifier_Status | Allowed (a), Restricted (r) |
Security | Enumerated | UTS | Identifier_Type | Show Values |
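Shaping and Rendering | Binary | ICU | Segment_Starter | No (N), Yes (Y) |
Shaping and Rendering | Binary | UCD | Join_Control | No (N), Yes (Y) |
Shaping and Rendering | Enumerated | UCD | East_Asian_Width | Ambiguous (A), Fullwidth (F), Halfwidth (H), Narrow (Na), Neutral (N), Wide (W) |
Shaping and Rendering | Enumerated | UCD | Grapheme_Cluster_Break | Show Values |
Shaping and Rendering | Enumerated | UCD | Joining_Group | Show Values |
Shaping and Rendering | Enumerated | UCD | Joining_Type | Dual_Joining (D), Join_Causing (C), Left_Joining (L), Non_Joining (U), Right_Joining (R), Transparent (T) |
Shaping and Rendering | Enumerated | UCD | Line_Break | Show Values |
Shaping and Rendering | Enumerated | UCD | Prepended_Concatenation_Mark | No (N), Yes (Y) |
Shaping and Rendering | Enumerated | UCD | Sentence_Break | Show Values |
Shaping and Rendering | Enumerated | UCD | Standardized_Variant | Show Values |
Shaping and Rendering | Enumerated | UCD | Vertical_Orientation | Rotated (R), Transformed_Rotated (Tr), Transformed_Upright (Tu), Upright (U) |
Shaping and Rendering | Enumerated | UCD | Word_Break | Show Values |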
UCA | Binary | UTS | uca | Show Values |
UCA | Binary | UTS | uca2 | Show Values |
UCA | Binary | UTS | uca2.5 | Show Values |
UCA | Binary | UTS | uca3 | Show Values |
Z-Other | Other | Other | Basic_Emoji | Other |
Z-Other | Other | Other | Equivalent_Unified_Ideograph | Other |
Z-Other | Other | Other | Extended_Pictographic | Other |
The Categories are from UCD Table 8. Property Summary Table, with some extended categories: Emoji, IDNA, Regex, Security, and UCA.
The Datatypes are from UCD Table 5. Property Type Key.
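As a rough illustration of how the datatypes differ in practice, the sketch below reads a handful of the properties listed above using Python's standard `unicodedata` module. This is an assumption for illustration only, not the tool behind this page: the helper name `describe` is hypothetical, Python exposes only a small subset of the table, and its bundled Unicode data version (`unicodedata.unidata_version`) may differ from the version reported on this page. Binary properties come back as booleans, Enumerated ones as short aliases, and String ones as mapped text.

```python
import unicodedata


def describe(ch: str) -> dict:
    """Collect a few properties from the table for a single character.

    Hypothetical helper: the keys reuse property names from the table,
    the values come from Python's own Unicode database.
    """
    return {
        # String property (empty string when the character has no name)
        "Name": unicodedata.name(ch, ""),
        # Enumerated properties, reported as short aliases such as "Lu" or "ON"
        "General_Category": unicodedata.category(ch),
        "Bidi_Class": unicodedata.bidirectional(ch),
        "East_Asian_Width": unicodedata.east_asian_width(ch),
        "Canonical_Combining_Class": unicodedata.combining(ch),
        # Binary property (mirrored() returns 0 or 1)
        "Bidi_Mirrored": bool(unicodedata.mirrored(ch)),
        # Numeric property (None when the character has no numeric value)
        "Numeric_Value": unicodedata.numeric(ch, None),
        # String mappings, approximating toNFD and toCasefold
        "toNFD": unicodedata.normalize("NFD", ch),
        "toCasefold": ch.casefold(),
    }


if __name__ == "__main__":
    print("Unicode data version:", unicodedata.unidata_version)
    for ch in ("A", "(", "\u00e9", "\uff15"):
        print(repr(ch), describe(ch))
```

Properties sourced from ICU, UTS, or the Unihan data (for example `NFC_Inert`, `Idn_Status`, or `kSimplifiedVariant`) are not available through this module and would need a different library.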
The Sources are:
Fonts and Display. If you don't have a good set of Unicode fonts (and a modern browser), you may not be able to read some of the characters. Suggested fonts that you can add for coverage are: the Noto Fonts site; Unicode Fonts for Ancient Scripts; and Large, multi-script Unicode fonts. See also: Unicode Display Problems.
Version 3.9; ICU version: 63.1; Unicode version: 12.0;