3 Character encodings
When putting notation on a computer we have to rely upon a certain
encoding of symbols. The most common one (at present) is the 7-bit
ASCII character set, together with a series of 8-bit extensions known
as ISO-8859. The extension we are most familiar with is Latin-1
(ISO-8859-1). This contains most of the letters
in use in western Europe, such as the German umlauts (ä, Ä, ...),
the French accents (á, à, â, ç, ...), the Scandinavian
compounds (æ, Æ, ...), and the Icelandic soft d (ð) and soft t (þ).
Other Latin characters, like the palatal (?) consonants in Latvian,
letters used in Lithuanian, Czech, Croatian, Lapp, ..., Polish and
Estonian, the Romanian accented letters, etc., are placed in other
ISO Latin character sets. Versions of ISO-8859
also extend the 7-bit ASCII character set with Greek, Arabic and
Hebrew characters.
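
As a rough illustration of how the ISO-8859 parts divide up the
characters, the following Python sketch checks which 8-bit sets can
represent a given letter as a single byte. The helper name `fits` and
the choice of sample characters are assumptions made for this example;
the codec names are the ones Python's standard library uses for ASCII,
ISO-8859-1 (Latin-1), ISO-8859-4 (Latin-4) and ISO-8859-7 (Greek).

```python
def fits(ch: str, codec: str) -> bool:
    """Return True if the character encodes to exactly one byte in codec."""
    try:
        return len(ch.encode(codec)) == 1
    except UnicodeEncodeError:
        # The character has no position in this character set.
        return False


# Sample characters (illustrative choice): plain ASCII, German umlaut,
# Scandinavian ae, Icelandic eth and thorn, Latvian k-cedilla, Greek alpha.
SAMPLES = ["a", "ä", "æ", "ð", "þ", "ķ", "α"]
CODECS = ["ascii", "iso8859_1", "iso8859_4", "iso8859_7"]

# Map each character to the list of sets that contain it.
coverage = {ch: [c for c in CODECS if fits(ch, c)] for ch in SAMPLES}

if __name__ == "__main__":
    for ch, sets in coverage.items():
        print(ch, sets)
```

In this sketch the Icelandic letters are found in Latin-1 only, the
Latvian letter in Latin-4 only, and the Greek letter in ISO-8859-7
only, which is the situation the text above describes.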
There are also more
encompassing character encodings than the ISO Latin series, e.g.,
the ISO-10646 Universal Multiple-Octet Coded Character Set (UCS) or
the Unicode standard
that Java chose (ISO-10646 and Unicode are identical for the parts
that have been defined).
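
The relation between these larger sets and Latin-1 can be sketched in
Python as well. The example below assumes that Python's "utf-32-be"
and "utf-16-be" codecs are acceptable stand-ins for the four-octet and
two-octet forms of UCS (they coincide for the characters used here);
the function names are hypothetical.

```python
def ucs4_octets(ch: str) -> bytes:
    """Four-octet form of a character (big-endian)."""
    return ch.encode("utf-32-be")


def ucs2_octets(ch: str) -> bytes:
    """Two-octet form; valid for characters with code points below 0x10000."""
    return ch.encode("utf-16-be")


def latin1_matches_unicode() -> bool:
    """Check that each Latin-1 byte value equals the Unicode code point."""
    return all(bytes([i]).decode("iso8859_1") == chr(i) for i in range(256))


if __name__ == "__main__":
    for ch in ["a", "ä", "ķ"]:
        print(ch, hex(ord(ch)), ucs2_octets(ch).hex(), ucs4_octets(ch).hex())
    print(latin1_matches_unicode())
```

The last check reflects that the first 256 code points of Unicode are
laid out like Latin-1, so text in that set converts by widening each
byte, whereas characters from the other ISO Latin sets need a lookup.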
CoFI Note: T-1 ---- 7 April 1997.
Comments to Magne.Haveraaen@ii.uib.no