Individual char values in a Unicode string

I'm trying to figure out how to use [] String or jconv or something to figure out the actual code-point values in a Unicode/UTF-8 string. For example, how can I write f such that

f('tö中') ==> [ 0x74, 0xf6, 0x4e2d ]

(hex just for clarity of course, I want numbers).

  -Tim

Tim Bray wrote:

I'm trying to figure out how to use [] String or jconv or something
to figure out the actual code-point values in a Unicode/UTF-8
string. For example, how can I write f such that

f('tö中') ==> [ 0x74, 0xf6, 0x4e2d ]

(hex just for clarity of course, I want numbers).

Hex numbers are numbers. :)

To answer your question, you can extract bytes from a string:

#!/usr/bin/ruby

s = "this is a test"

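# each_byte yields every byte as an Integer on any Ruby version.
# (s[i] returns a number only on Ruby 1.8; 1.9 and later return a
# one-character String instead.)
s.each_byte do |b|
        puts b # emits numbers, not characters
end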

But I don't think Ruby recognizes characters, Unicode or otherwise. So it may
not be able to interpret a UTF-8 string as Unicode code points without explicit
code from the programmer.
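
For illustration, here is a rough sketch of what such explicit code could look like: it walks the bytes and assembles code points by hand. The method name utf8_codepoints is hypothetical, and the sketch assumes the input is well-formed UTF-8 (it does no validation of lead or continuation bytes).

```ruby
# Hypothetical helper: decode UTF-8 bytes into code points by hand.
# Assumes well-formed UTF-8; malformed input is not detected.
def utf8_codepoints(str)
  codepoints = []
  remaining = 0 # continuation bytes still expected for the current character
  current = 0   # code point being assembled

  str.each_byte do |b|
    if remaining > 0
      # Continuation byte (10xxxxxx): append its low 6 bits.
      current = (current << 6) | (b & 0x3f)
      remaining -= 1
      codepoints << current if remaining == 0
    elsif b < 0x80
      # Single-byte (ASCII) character.
      codepoints << b
    elsif (b & 0xe0) == 0xc0
      # Lead byte of a 2-byte sequence (110xxxxx).
      current = b & 0x1f
      remaining = 1
    elsif (b & 0xf0) == 0xe0
      # Lead byte of a 3-byte sequence (1110xxxx).
      current = b & 0x0f
      remaining = 2
    else
      # Treated as the lead byte of a 4-byte sequence (11110xxx).
      current = b & 0x07
      remaining = 3
    end
  end

  codepoints
end

p utf8_codepoints("t\u00f6\u4e2d")
```

In practice a built-in decoder is preferable; the point here is only to show how the bytes map to code-point values.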


--
Paul Lutus
http://www.arachnoid.com

Tim Bray wrote:

I'm trying to figure out how to use [] String or jconv or something to figure out the actual code-point values in a Unicode/UTF-8 string. For example, how can I write f such that

f('tö中') ==> [ 0x74, 0xf6, 0x4e2d ]

(hex just for clarity of course, I want numbers).

-Tim

'tö中'.unpack("U*") => [116, 246, 20013]
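
Wrapped up as the f from the question, this might look like the sketch below. It assumes the string really holds UTF-8 bytes, since the "U*" directive of String#unpack decodes UTF-8 into code-point integers. On Ruby 1.9 and later, String#codepoints should give the same result for a UTF-8 encoded string.

```ruby
# f as asked for in the question: UTF-8 string in, code-point integers out.
# "U*" tells unpack to decode every UTF-8 character into its code point.
def f(str)
  str.unpack("U*")
end

codes = f("t\u00f6\u4e2d") # same string as 'tö中', written with escapes
p codes
puts codes.map { |c| format("0x%x", c) }.join(", ")
```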

Regards,

Dan