15. The high bits of the first byte indicate how many bytes are used to
represent a specific character, and every continuation byte begins with
the bits 10. Software can therefore read a UTF-8 stream easily, even
when starting in the middle.
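As a rough illustration of the point about high bits, the sketch below reads the length of a sequence from its first byte and skips forward to the next character boundary from an arbitrary offset. The function names (sequence_length, is_continuation, resync) are made up for this example, and the code only checks bit patterns; it does not reject overlong or otherwise invalid sequences.

```python
def sequence_length(lead_byte: int) -> int:
    """Return the length of a UTF-8 sequence, judged from its first byte."""
    if lead_byte < 0x80:             # 0xxxxxxx: single byte (ASCII)
        return 1
    if lead_byte >> 5 == 0b110:      # 110xxxxx: two-byte sequence
        return 2
    if lead_byte >> 4 == 0b1110:     # 1110xxxx: three-byte sequence
        return 3
    if lead_byte >> 3 == 0b11110:    # 11110xxx: four-byte sequence
        return 4
    raise ValueError("not a lead byte")


def is_continuation(byte: int) -> bool:
    """Continuation bytes always look like 10xxxxxx."""
    return byte >> 6 == 0b10


def resync(data: bytes, pos: int) -> int:
    """From any offset, skip continuation bytes to reach a character start."""
    while pos < len(data) and is_continuation(data[pos]):
        pos += 1
    return pos


# Hypothetical sample: characters of 1, 2, 3 and 4 bytes in a row.
data = "a\u00e9\u20ac\U0001F600".encode("utf-8")
start = resync(data, 4)              # offset 4 falls inside the euro sign
tail = data[start:].decode("utf-8")  # decodes cleanly from the boundary
```

Because a continuation byte can never be mistaken for a lead byte, a reader dropped into the middle of a stream loses at most one character.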
19. The characters of most modern languages fall in the "Basic Multilingual
Plane" (U+0000 to U+FFFF), which UTF-8 encodes in at most 3 bytes.
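A small sketch of the 3-byte claim, assuming Python's built-in UTF-8 codec; the helper names utf8_length and in_bmp and the sample characters are chosen only for this illustration.

```python
def utf8_length(ch: str) -> int:
    """Number of bytes a single character occupies when encoded as UTF-8."""
    return len(ch.encode("utf-8"))


def in_bmp(ch: str) -> bool:
    """True when the character's code point lies in U+0000..U+FFFF."""
    return ord(ch) <= 0xFFFF


# Sample characters: Latin, accented Latin, CJK, and one outside the BMP.
samples = ["A", "\u00e9", "\u65e5", "\U0001F600"]
lengths = {ch: utf8_length(ch) for ch in samples}
```

Every character for which in_bmp returns True needs no more than 3 bytes; code points above U+FFFF take 4.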
21. The Tengwar script has been proposed for the Unicode standard. The code points
are subject to change; the range U+16080 to U+160FF in the SMP (Supplementary
Multilingual Plane) is tentatively allocated for Tengwar according to the
current Unicode roadmap.
22. You need to have an appropriate font installed
to display Unicode characters.
26. HTTP headers
• You can specify what character set you
want back when you send a form post
• This is informational for the server
• Just setting these won’t change how your
app behaves, unless your web app has code