String.size and $TextLength (i.e. $Text.size) report the length of the UTF-16 representation of a note's text, that is, the number of 16-bit UTF-16 code units it contains. Unicode can represent more than a million distinct characters. Most characters in everyday use count as a single character, i.e. a count of +1 in String.size and $TextLength. These are the characters in the 'Basic Multilingual Plane' (BMP), which covers most modern languages and many common symbols. They include 'invisible' characters such as spaces, line returns and tabs. The BMP also includes:
- Western or Eastern European languages based on Latin characters.
- Most Central European languages and Turkish.
- Russian (and related Cyrillic scripts), Greek, Thai, Arabic, and Hebrew.
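However, some characters lie outside the Basic Multilingual Plane and need more data to describe them. In UTF-16 each is stored as a pair of 16-bit units (a 'surrogate pair'), which is why they are sometimes loosely called 'double-byte' characters. Each of these counts as two characters for the purposes of String.size and $TextLength. Examples are:
- Most emoji. An emoji built from several joined characters, such as a flag or family emoji, counts as more than two.
- Rarer characters in Asian language scripts, such as less common Chinese ideographs. Everyday Chinese, Japanese, and Korean characters are in the Basic Multilingual Plane and count as one.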
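The counting rule above can be checked outside Tinderbox. The following is a minimal sketch in Python, not Tinderbox action code: it assumes only that String.size and $TextLength match the number of UTF-16 code units, as described above. The helper name utf16_length is hypothetical and chosen for illustration.

```python
def utf16_length(text: str) -> int:
    """Count the 16-bit UTF-16 code units needed to store text.

    Assumption: this is the same count that String.size and
    $TextLength report, per the description above.
    """
    # Encoding as "utf-16-le" omits the byte order mark, so the byte
    # count is exactly two bytes per code unit.
    return len(text.encode("utf-16-le")) // 2


# Sample characters, written as escapes so the code points are unambiguous.
samples = {
    "Latin letter A": "A",                       # in the BMP: 1
    "Tab": "\t",                                 # 'invisible' but counted: 1
    "Cyrillic Zhe": "\u0416",                    # in the BMP: 1
    "Common Chinese ideograph": "\u4e2d",        # in the BMP: 1
    "Korean Hangul syllable": "\ud55c",          # in the BMP: 1
    "Grinning face emoji": "\U0001F600",         # outside the BMP: 2
    "Rare ideograph (Extension B)": "\U00020000",  # outside the BMP: 2
}

for label, char in samples.items():
    print(f"{label}: {utf16_length(char)}")
```

A practical consequence is that a character position computed from what is visible on screen may not match an offset derived from String.size once the text contains emoji or other characters outside the Basic Multilingual Plane.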