Hi,
I need to convert between different character sets,
but didn’t find any library to do so except for
ruby-gnome’s glib.convert
Is there any character conversion library which
doesn’t come with a complete graphical library?
regards
Hadmut
“Hadmut Danisch” spamblock@danisch.de wrote in message
news:c0j99t$t9p$04$1@news.t-online.com…
Hi,
I need to convert between different character sets,
but didn’t find any library to do so except for
ruby-gnome’s glib.convert
Is there any character conversion library which
doesn’t come with a complete graphical library?
I’m not aware of any, but that isn’t to say there isn’t one.
I assume you have checked out “iconv”, which I have no experience with.
There is a good code page tutorial here; follow a few links if you need more:
http://www.cs.tut.fi/~jkorpela/chars.html
UTF-8 is easily decoded into UCS-2 code points, and from there it is fairly
easy to go to 8859-1, because it has only 256 characters and all of them sit
in the low 8 bits of UCS-2 (U+0000 to U+00FF).
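To make the "easily decomposed" claim concrete, here is a minimal sketch of decoding UTF-8 bytes into code points by hand. The method names `utf8_to_codepoints` and `latin1_representable?` are made up for illustration, and the sketch assumes well-formed input restricted to the 1-, 2- and 3-byte forms that fit in UCS-2.

```ruby
# Decode an array of UTF-8 bytes into code points (U+0000..U+FFFF only).
# Hypothetical helper; continuation bytes are not validated.
def utf8_to_codepoints(bytes)
  codepoints = []
  i = 0
  while i < bytes.length
    b0 = bytes[i]
    if b0 < 0x80                        # 0xxxxxxx: plain ASCII
      codepoints << b0
      i += 1
    elsif (b0 & 0xE0) == 0xC0           # 110xxxxx 10xxxxxx
      codepoints << (((b0 & 0x1F) << 6) | (bytes[i + 1] & 0x3F))
      i += 2
    elsif (b0 & 0xF0) == 0xE0           # 1110xxxx 10xxxxxx 10xxxxxx
      codepoints << (((b0 & 0x0F) << 12) |
                     ((bytes[i + 1] & 0x3F) << 6) |
                     (bytes[i + 2] & 0x3F))
      i += 3
    else
      raise ArgumentError, format("cannot decode lead byte 0x%02X", b0)
    end
  end
  codepoints
end

# A code point fits in ISO-8859-1 exactly when it is at most 0xFF.
def latin1_representable?(codepoints)
  codepoints.all? { |cp| cp <= 0xFF }
end
```

If every decoded value passes `latin1_representable?`, the code points can be written out as single bytes unchanged.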
You should btw. also consider 8859-9 (I think it is) it’s basically 8859-1
with the euro sign.
Mikkel
Hi,
I need to convert between different character sets,
but didn’t find any library to do so except for
ruby-gnome’s glib.convert
Is there any character conversion library which
doesn’t come with a complete graphical library?
Between UTF-8 and Latin-1, you can use, without any external library:
utf8string.unpack("U*").pack("C*") # => latin1 string
latin1string.unpack("C*").pack("U*") # => utf8 string
I need to convert between different character sets,
but didn’t find any library to do so except for
ruby-gnome’s glib.convert
Is there any character conversion library which
doesn’t come with a complete graphical library?
I’m not aware of any but that isn’t to say there isn’t one
I assume you have checked out “iconv” which I have no experience with.
iconv sounds like the tool to me.
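For readers on newer interpreters: Ruby 1.9 and later ship `String#encode`, which covers the common iconv use cases without an extension (the Iconv binding itself was removed from the standard library in Ruby 2.0). The sketch below assumes such a Ruby; the `convert` helper is a hypothetical name, not something posted in this thread.

```ruby
# Re-tag raw bytes with their real encoding, then transcode.
# Assumes Ruby >= 1.9; `convert` is an illustrative helper name.
def convert(raw, from, to)
  raw.dup.force_encoding(from).encode(to)
end

# "grüße" as ISO-8859-1 bytes, built with pack so no source encoding is assumed.
latin1_bytes = [0x67, 0x72, 0xFC, 0xDF, 0x65].pack("C*")

utf8 = convert(latin1_bytes, "ISO-8859-1", "UTF-8")
back = convert(utf8, "UTF-8", "ISO-8859-1")
```

`force_encoding` only changes the label on the bytes, while `encode` actually rewrites them, which is why both steps are needed when the input arrives untagged.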
You should btw. also consider 8859-9 (I think it is) it’s basically 8859-1
with the euro sign.
That would be ISO-8859-15, which adds the Euro and a few French and
Finnish characters missing from 8859-1. -9 is the Turkish variant (Latin-5).
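A quick way to check which of these character sets actually contains the Euro sign, sketched with `String#encode` and therefore assuming Ruby 1.9 or later. `EURO` and `encodable?` are names invented for this example.

```ruby
EURO = "\u20AC" # U+20AC EURO SIGN

# True if every character of str exists in the target encoding.
def encodable?(str, encoding)
  str.encode(encoding)
  true
rescue Encoding::UndefinedConversionError
  false
end

%w[ISO-8859-1 ISO-8859-9 ISO-8859-15].each do |name|
  puts "#{name}: #{encodable?(EURO, name) ? 'has' : 'lacks'} the Euro sign"
end
```

In ISO-8859-15 the Euro takes byte 0xA4, the slot that holds the generic currency sign in ISO-8859-1.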
Ari
Aredridel wrote:
I need to convert between different character sets,
but didn’t find any library to do so except for
ruby-gnome’s glib.convert
…
iconv sounds like the tool to me.
Or, if all you’re doing is converting ISO-8859-1 to UTF-8, you could use
String#unpack and Array#pack:
# ISO-8859-1 -> UTF-8:
string.unpack("C*").pack("U*")
# UTF-8 -> ISO-8859-1:
string.unpack("U*").pack("C*")
Of course, the conversion from UTF-8 to ISO-8859-1 won’t work in all cases,
because the character repertoire of UTF-8 is larger than that of ISO-8859-1.
Going from ISO-8859-1 to UTF-8 should always work, though.
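One way to handle the cases that do not fit is to check each code point before packing and substitute a placeholder for anything above 0xFF. This is a minimal sketch under that assumption; `latin1_or_replace` and its `replacement` parameter are hypothetical names, and "?" is an arbitrary choice of placeholder.

```ruby
# Convert a UTF-8 string to ISO-8859-1 bytes, replacing characters that
# have no Latin-1 equivalent instead of silently corrupting them.
def latin1_or_replace(utf8_string, replacement = "?")
  fallback = replacement.ord
  codepoints = utf8_string.unpack("U*").map do |cp|
    cp <= 0xFF ? cp : fallback
  end
  codepoints.pack("C*")
end

latin1_or_replace("caf\u00E9 \u20AC5").bytes
# é (U+00E9) becomes the single byte 0xE9; the Euro sign becomes "?"
```

The check matters because `pack("C*")` writes one byte per value, so a code point above 0xFF cannot survive the trip intact.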
— SER