Character (computing)

In computer and machine-based telecommunications terminology, a character is a unit of information that roughly corresponds to a grapheme, grapheme-like unit, or symbol, such as in an alphabet or syllabary in the written form of a natural language.[1]

Examples of characters include letters, numerical digits, common punctuation marks (such as "." or "-"), and whitespace. The concept also includes control characters, which do not correspond to visible symbols but rather to instructions to format or process the text. Examples of control characters include carriage return and tab, as well as instructions to printers or other devices that display or otherwise process text.
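As an illustration of these classes, the following minimal Python sketch sorts a few characters into coarse groups using the Unicode general category reported by the standard `unicodedata` module. The helper name `describe` and the grouping into five labels are assumptions made for this example, not a standard classification.

```python
import unicodedata


def describe(ch: str) -> str:
    """Return a coarse class for one character (hypothetical helper)."""
    category = unicodedata.category(ch)  # two-letter Unicode general category
    if category == "Cc":                 # control characters such as CR and tab
        return "control"
    if category.startswith("L"):         # Lu, Ll, Lt, Lm, Lo
        return "letter"
    if category == "Nd":                 # decimal digits
        return "digit"
    if category.startswith("P"):         # Po, Pd, Ps, Pe, ...
        return "punctuation"
    if category.startswith("Z"):         # Zs, Zl, Zp
        return "whitespace"
    return "other"


for ch in ["a", "7", ".", "-", " ", "\r", "\t"]:
    print(repr(ch), describe(ch))
```

Carriage return and tab are reported here as control characters because the check for category `Cc` comes first; many programming languages also treat them as whitespace for parsing purposes.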

Characters are typically combined into strings.
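A short Python sketch of this relationship, assuming Python's model in which a string is a sequence of Unicode code points (the variable names are illustrative only):

```python
chars = ["c", "a", "t"]

# Combine individual characters into a string.
word = "".join(chars)

# A string can be split back into its characters.
assert list(word) == chars

# Each character has a numeric code; ord() returns its Unicode code point.
code_points = [ord(ch) for ch in word]
print(word, code_points)  # cat [99, 97, 116]
```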

Historically, the term character was also used simply to denote a specific number of contiguous bits. While a character is most commonly assumed to refer to 8 bits (one byte) today, other definitions have been used. 6-bit character codes were once popular; they provided only upper case, since 6 bits are enough to represent lower case as well but not when numbers and punctuation are also allowed for.[2][3] The even smaller 5-bit Baudot code has also been used in the past. The term has also been applied to 4 bits,[4] but with only 16 possible values such a unit was not meant to, and cannot, represent the full English alphabet. See also Universal Character Set characters, for which 8 bits are not enough to represent every character, although all of them can be represented with one or more 8-bit code units in UTF-8.
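The arithmetic behind these widths can be checked with a small Python sketch. The helper names `code_count` and `utf8_length` are hypothetical, the alphabet size assumes the 26-letter English alphabet, and the sample characters are chosen only for illustration.

```python
def code_count(bits: int) -> int:
    """Number of distinct values an n-bit unit can hold."""
    return 2 ** bits


LETTERS = 26  # English alphabet, one case
DIGITS = 10

for bits in (4, 5, 6, 8):
    print(bits, "bits ->", code_count(bits), "values")

# 4 bits: 16 values, fewer than the 26 letters of one case.
# 6 bits: 64 values; both cases plus digits need 62, leaving only 2 spare.
spare_6bit = code_count(6) - (2 * LETTERS + DIGITS)
print("spare 6-bit codes after both cases and digits:", spare_6bit)


def utf8_length(ch: str) -> int:
    """Number of 8-bit code units UTF-8 uses for one character."""
    return len(ch.encode("utf-8"))


# U+0041 (A), U+00E9 (e with acute), U+20AC (euro sign), U+1F600 (emoji)
for ch in ["\u0041", "\u00e9", "\u20ac", "\U0001F600"]:
    print(f"U+{ord(ch):04X} -> {utf8_length(ch)} code unit(s)")
```

The last loop shows the point made about the Universal Character Set: a single 8-bit unit covers only the first sample, while the others need two, three, and four UTF-8 code units respectively.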

  1. ^ Cite error: The named reference MW_Definition was invoked but never defined.
  2. ^ Cite error: The named reference Dreyfus_1958_Gamma60 was invoked but never defined.
  3. ^ Cite error: The named reference Buchholz_1962 was invoked but never defined.
  4. ^ Cite error: The named reference Intel_1973_MCS-4 was invoked but never defined.