Unicode Basic Multilingual Plane (BMP)

Basic Multilingual Plane (BMP)

Unicode is divided into 17 planes, each holding 65,536 code points (16 bits), for a total of 1,114,112 code points from U+0000 to U+10FFFF. Only a minority of these are currently assigned to characters. The first and most important plane is the Basic Multilingual Plane (Plane 0, BMP), which contains nearly all commonly used writing systems and symbols. It covers the code points U+0000 to U+FFFF.
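Because every plane has the same size, the plane of a code point follows from simple integer arithmetic. The following is a minimal sketch of that calculation; the names plane_of and plane_range are made up for this illustration and are not part of any standard library.

```python
from typing import Tuple

PLANE_SIZE = 0x10000                           # 65,536 code points per plane
PLANE_COUNT = 17                               # planes 0 through 16
MAX_CODE_POINT = PLANE_SIZE * PLANE_COUNT - 1  # 0x10FFFF


def plane_of(code_point: int) -> int:
    """Return the plane number (0 to 16) that contains code_point."""
    if not 0 <= code_point <= MAX_CODE_POINT:
        raise ValueError(f"not a Unicode code point: {code_point:#x}")
    return code_point // PLANE_SIZE


def plane_range(plane: int) -> Tuple[int, int]:
    """Return the first and last code point of the given plane."""
    if not 0 <= plane < PLANE_COUNT:
        raise ValueError(f"no such plane: {plane}")
    first = plane * PLANE_SIZE
    return first, first + PLANE_SIZE - 1


print(plane_of(ord("A")))                 # a Latin letter lies in plane 0
first, last = plane_range(1)
print(f"U+{first:04X} to U+{last:04X}")   # U+10000 to U+1FFFF
```

Integer division by 0x10000 is equivalent to shifting the code point right by 16 bits, so the plane number is simply whatever remains above the lower 16 bits.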

Among other things, the BMP contains Latin letters and symbols, characters for phonetic transcription, other European scripts such as Greek and Cyrillic, African and Asian scripts including Hiragana and Katakana, diacritical marks, Canadian Aboriginal syllabics, Chinese, Japanese and Korean (CJK) ideographs, symbols, and various other characters. In addition, the BMP reserves a range for private use, in which users can define their own characters.

Supplementary Multilingual Plane (SMP)

The second plane (U+10000 to U+1FFFF) is the Supplementary Multilingual Plane (Plane 1, SMP). It mostly contains historic writing systems and symbols that are used less frequently, for example domino tiles, although it is also where many emoji are encoded.
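To check whether a piece of text leaves the BMP, it is enough to look for code points above U+FFFF. The sketch below does this for a sample string that contains a domino tile; the function name supplementary_characters and the sample string are assumptions made for this example.

```python
import unicodedata


def supplementary_characters(text):
    """Return (character, code point, plane) for each character outside the BMP."""
    return [(ch, ord(ch), ord(ch) >> 16) for ch in text if ord(ch) > 0xFFFF]


# Latin capital A, a domino tile from the SMP, and Greek capital omega.
sample = "A\U0001F030\u03A9"

for ch, cp, plane in supplementary_characters(sample):
    # The character name comes from the Unicode database bundled with Python.
    print(f"U+{cp:04X} is in plane {plane}: {unicodedata.name(ch, '<unnamed>')}")
```

Only the domino tile is reported, because the other two characters lie in the BMP.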

Supplementary Ideographic Plane (SIP)

The third plane (U+20000 to U+2FFFF) is the Supplementary Ideographic Plane (Plane 2, SIP). It contains exclusively Chinese, Japanese and Korean (CJK) ideographs that are rarely used.

Planes 3 to 13

The code range U+30000 to U+DFFFF, that is, the fourth to fourteenth planes (Planes 3 to 13), is almost entirely unassigned. Since Unicode 13.0, Plane 3 has held further rarely used CJK ideographs, while Planes 4 to 13 are still empty. Even if every known writing system of the world that has not yet been encoded were added, there would still be room left for other characters. However, the number of characters outside of writing systems that could be encoded at some point is open-ended.

Supplementary Special-purpose Plane (SSP)

The fifteenth plane (Plane 14), which covers the code range U+E0000 to U+EFFFF, is called the Supplementary Special-purpose Plane (SSP). It contains non-graphic characters: a few control characters for language tagging (for cases where the language is not declared by another protocol such as XML) and variation selectors. The latter can be used to request an alternative glyph for a character when that glyph cannot be determined from the context.
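The non-graphic nature of these characters shows up in their general category. The sketch below assumes the block boundaries of the Unicode standard, which the text above does not give: Tags at U+E0000 to U+E007F and Variation Selectors Supplement at U+E0100 to U+E01EF. The function name describe_ssp is made up for this illustration.

```python
import unicodedata

# Block boundaries assumed from the Unicode standard, not from the text above.
TAGS = range(0xE0000, 0xE0080)
VARIATION_SELECTORS_SUPPLEMENT = range(0xE0100, 0xE01F0)


def describe_ssp(code_point):
    """Return (kind, general category) for a code point in Plane 14."""
    if code_point >> 16 != 14:
        raise ValueError(f"U+{code_point:04X} is not in Plane 14")
    if code_point in TAGS:
        kind = "tag character"
    elif code_point in VARIATION_SELECTORS_SUPPLEMENT:
        kind = "variation selector"
    else:
        kind = "other or unassigned"
    # "Cf" is a format character, "Mn" a non-spacing mark, "Cn" unassigned.
    return kind, unicodedata.category(chr(code_point))


print(describe_ssp(0xE0001))  # the language tag character
print(describe_ssp(0xE0100))  # the first selector in the supplement block
```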

Private Use Area

The last two planes (Planes 15 and 16, U+F0000 to U+10FFFF) are reserved for private use, so anyone can assign their own characters to them. The Unicode Consortium does not define these characters, so they cannot be interpreted consistently without a separate agreement between the parties that exchange them.
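Software that receives text can test for private-use code points before trying to interpret them. The sketch below hard-codes the private-use ranges as an assumption taken from the Unicode standard: U+E000 to U+F8FF inside the BMP, and Planes 15 and 16 without the last two code points of each plane, which are noncharacters. The function names are made up for this example.

```python
import unicodedata

# Ranges assumed from the Unicode standard; the bounds are inclusive.
PRIVATE_USE_RANGES = (
    (0xE000, 0xF8FF),      # private use area inside the BMP
    (0xF0000, 0xFFFFD),    # Plane 15
    (0x100000, 0x10FFFD),  # Plane 16
)


def is_private_use(code_point):
    """Return True if code_point lies in one of the private-use ranges."""
    return any(first <= code_point <= last for first, last in PRIVATE_USE_RANGES)


def agrees_with_unicodedata(code_point):
    """Cross-check against Python's Unicode database (category "Co")."""
    in_database = unicodedata.category(chr(code_point)) == "Co"
    return is_private_use(code_point) == in_database


print(is_private_use(0xF0000))   # first code point of Plane 15
print(is_private_use(ord("A")))  # an ordinary Latin letter
```

The cross-check uses the general category "Co", which the Unicode database assigns to private-use code points.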