Unicode in Ruby now?

I beg your pardon,
you have all the wonderful standards at your
fingertips.
What is meant by “representing” a character?
How are glyphs mapped to code points?
Is that a many-to-many mapping?
What are the attributes of a glyph?
What are the attributes of a code point?
Outside of natural language text processing,
are there areas where the parsing of
non-Latin-1 strings is relevant? If so,
what are they?
Please help my ignorance
Jan


Ecclesiastes 1:9 The thing that hath been, it is that
which shall be; and that which is done is that which
shall be done: and there is no new thing under the
sun.

The King James Version (Authorized)



Hi,

I beg your pardon,
you have all the wonderful standards at your
fingertips.

Probably you’re asking Curt, but I will answer what I can.

What is meant by “representing” a character?
What are the attributes of a code point?

A code point is the numeric index assigned to a character within a
character set. “Representing” a character means encoding that number
as bytes, for example:

Japanese Hiragana “Ka” has code point 9259 (0x242B) in JIS.
EUC encoded “Ka” is “\xa4\xab”.

Japanese Hiragana “Ka” has code point 12363 (0x304B) in Unicode.
UTF-8 encoded “Ka” is “\xe3\x81\x8b”.
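
The same distinction can be checked with a short script. This is only a sketch, and it assumes Ruby 1.9 or later, where String#ord, String#bytes and String#encode are available and the EUC-JP transcoder is built in. The variable names are just for illustration. The JIS number is recovered from the EUC-JP bytes by subtracting 0x80 from each byte.

```ruby
# Hiragana "Ka", written as a Unicode escape so the source file stays ASCII.
ka = "\u304B"

# The code point is a number; it does not depend on any byte encoding.
unicode_code_point = ka.ord

# The same character encoded as bytes in two different encodings.
utf8_bytes = ka.encode("UTF-8").bytes
euc_bytes  = ka.encode("EUC-JP").bytes

# EUC-JP stores a two-byte JIS code with the high bit set on each byte,
# so clearing that bit gives back the JIS code point.
jis_high, jis_low = euc_bytes.map { |b| b - 0x80 }
jis_code_point = (jis_high << 8) | jis_low

# Helper lambda to print bytes in the "\xNN" notation used above.
hex = lambda { |bytes| bytes.map { |b| "\\x%02x" % b }.join }

puts "Unicode code point: #{unicode_code_point}"
puts "UTF-8 bytes:        #{hex.call(utf8_bytes)}"
puts "JIS code point:     #{jis_code_point}"
puts "EUC-JP bytes:       #{hex.call(euc_bytes)}"
```

One character, two character sets, two different numbers, and two different byte sequences: the code point identifies the character inside a set, and the encoding decides which bytes go on the wire.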

Outside of natural language text processing,
are there areas where the parsing of
non-Latin-1 strings is relevant? If so,
what are they?

Yes, because some people in the world need it to represent their
everyday text. My mail, memos, journal, and almost everything else are
written in non-Latin-1 strings (EUC-JP).

						matz.

In message “Re: Unicode in Ruby now?” on 02/08/02, Jan Witt ontologist_2000@yahoo.com writes: