To begin with, I do not know who Andy Roonie is, but
I think it is worthwhile to point out some
of the serious problems relating to natural
languages, bytes, fonts, chars, glyphs and all that.
As I see it, the Unicode effort has been deeply
misguided right from the beginning and the Java
brotherhood has misunderstood the thing as well:
(1) in many languages there are more glyphs than
letters in the alphabet, e.g. because of ligatures,
i.e. letters that get intertwined with their
neighbors (take Hindi or Arabic as examples).
Unicode does not cater for this.
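
To make the letter/glyph distinction in (1) concrete, here is a small
sketch in Python (my choice of language, picked only because its standard
unicodedata module is handy). It assumes the Arabic letter BEH as the
example. Ordinary text stores the one letter; the four positional shapes
also exist as compatibility code points, and picking the right shape is
left to whatever renders the text.

```python
import unicodedata

# ARABIC LETTER BEH: the single letter that text normally stores.
BEH = "\u0628"

# Compatibility code points for its four positional shapes
# (isolated, final, initial, medial).
SHAPES = ["\uFE8F", "\uFE90", "\uFE91", "\uFE92"]


def shape_names(shapes):
    """Return the official Unicode name of each shape."""
    return [unicodedata.name(ch) for ch in shapes]


def fold_to_letter(ch):
    """NFKC normalization folds a presentation form back to the base letter."""
    return unicodedata.normalize("NFKC", ch)


for name in shape_names(SHAPES):
    print(name)

# One letter, four glyph shapes: every shape folds back to BEH.
print(all(fold_to_letter(ch) == BEH for ch in SHAPES))
```

This only shows the one-letter-to-many-glyphs relationship; it does not
do any shaping itself.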
(2) Diacritics are not everywhere as simple as
accents in French or umlauts in German, which
luckily could be fitted into Latin-1.
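
A rough illustration of (2), again in Python with the standard
unicodedata module. The sample strings are my own picks: a French e-acute,
which fits into Latin-1 as a single code point, and a Vietnamese e with
two stacked accents, which does not.

```python
import unicodedata


def code_points(text):
    """List the code points of a string as U+XXXX labels."""
    return [f"U+{ord(ch):04X}" for ch in text]


precomposed = "\u00E9"   # e-acute as one code point, present in Latin-1
decomposed = "e\u0301"   # e followed by COMBINING ACUTE ACCENT

print(code_points(precomposed))
print(code_points(decomposed))
# The two spellings differ as raw strings...
print(precomposed == decomposed)
# ...and only compare equal after normalization.
print(unicodedata.normalize("NFC", decomposed) == precomposed)

# Vietnamese e with circumflex AND acute: two stacked diacritics.
stacked = "e\u0302\u0301"
composed = unicodedata.normalize("NFC", stacked)
print(code_points(unicodedata.normalize("NFD", composed)))
# The composed form lies outside the 0..255 range of Latin-1.
print(ord(composed) > 0xFF)
```

So even "simple" accents need normalization before comparison, and
stacked diacritics leave Latin-1 behind altogether.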
(3) Some languages are written from left to right,
some from right to left, some from the top down,
and texts may be mixed.
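
To show what "mixed" means for (3), here is a simplified sketch that
splits a string into runs by the strong direction class of each character
(L, R or AL in Unicode terms). It is NOT the full bidirectional algorithm:
spaces, digits and punctuation are simply glued to the current run, and
the helper name direction_runs is made up for this example.

```python
import unicodedata


def direction_runs(text):
    """Group characters into runs by strong bidi class (L, R or AL)."""
    runs = []
    for ch in text:
        cls = unicodedata.bidirectional(ch)
        if cls not in ("L", "R", "AL"):
            # Neutral or weak character (space, digit, punctuation):
            # glue it to the current run instead of resolving it properly.
            # Such characters before the first strong one are dropped.
            if runs:
                runs[-1][1] += ch
            continue
        if runs and runs[-1][0] == cls:
            runs[-1][1] += ch
        else:
            runs.append([cls, ch])
    return [(cls, chunk) for cls, chunk in runs]


# Latin "Ruby", Hebrew "shalom", Arabic "salam" in one string.
mixed = "Ruby \u05E9\u05DC\u05D5\u05DD \u0633\u0644\u0627\u0645"
for cls, chunk in direction_runs(mixed):
    print(cls, [f"U+{ord(c):04X}" for c in chunk])
```

An editor has to lay out each run in its own direction, which is where
the real work starts.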
(4) Some historic languages are even written in
boustrophedon style or even have symbols that face
left or right depending on context like
Please consider that a multilingual text editor
must know about the [possibly varying] glyph bindings
of all of its
(5) Japanese, as you probably know, has the rich set
of kanji characters and the two kana syllabaries,
but no ligatures.
(6) Collating sequences are a nontrivial issue.
In classical Spanish, e.g., LL and CH are collated as single letters.
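
A minimal sketch of why (6) is nontrivial. Traditional Spanish ordering
treats CH and LL as letters of their own, sorted after C and L
respectively, so a plain code-point sort gets the order wrong. The
alphabet table and the collation_key helper below are my own illustration,
not any library's API, and accented vowels are deliberately left out.

```python
# Traditional Spanish alphabet, with CH and LL as letters of their own.
ALPHABET = [
    "a", "b", "c", "ch", "d", "e", "f", "g", "h", "i", "j", "k", "l", "ll",
    "m", "n", "\u00F1", "o", "p", "q", "r", "s", "t", "u", "v", "w", "x",
    "y", "z",
]
RANK = {letter: i for i, letter in enumerate(ALPHABET)}


def collation_key(word):
    """Turn a word into a list of letter ranks, reading CH and LL as units."""
    word = word.lower()
    key = []
    i = 0
    while i < len(word):
        pair = word[i:i + 2]
        if pair in ("ch", "ll"):
            key.append(RANK[pair])
            i += 2
        else:
            # Characters not in the table sort after every known letter.
            key.append(RANK.get(word[i], len(ALPHABET)))
            i += 1
    return key


words = ["luz", "llama", "cuna", "chico", "dama", "loma"]
print(sorted(words))                     # plain code-point order
print(sorted(words, key=collation_key))  # traditional Spanish order
```

In the second list "chico" comes after "cuna" and "llama" after "luz",
which no byte-wise comparison will ever give you.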
(7) Real and virtual keyboards are a major issue:
There are “latinized” keyboards that allow non-native
speakers of Greek, Russian, Arabic, Hebrew etc. to
find the equivalents of, say, English letters in
the same places. Diacritics like accents also
affect the keyboard layout.
(8) UTF-8 and open-ended variable-length encoding
is obviously the way to go, but I wish I knew who
is looking after these things in Ruby and how far
they’ve got. (As far as I know, you do not
get cut-and-dried solutions in Java yet.)
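
For (8), a quick sketch of what "variable-length" means in practice. The
sample characters are arbitrary picks of mine, one for each UTF-8 length
from one to four bytes; utf8_profile is a made-up helper name.

```python
def utf8_profile(text):
    """Pair each character's code point with its UTF-8 length in bytes."""
    return [(f"U+{ord(ch):04X}", len(ch.encode("utf-8"))) for ch in text]


# Latin a, e-acute, Devanagari KA, Ugaritic ALPA.
sample = "a\u00E9\u0915\U00010380"

for code_point, size in utf8_profile(sample):
    print(code_point, size)

encoded = sample.encode("utf-8")
# Four characters, but more than four bytes.
print(len(sample), len(encoded))
# Decoding restores the original text unchanged.
print(encoded.decode("utf-8") == sample)
```

The point being that counting bytes and counting characters are two
different things as soon as you leave ASCII.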
(9) How about CGI scripts and Tk GUIs in Urdu?