String to UTF

How to convert a single string into UTF format.

str = "test"
strUTF = str. ???

Thanks

How to convert a single string into UTF format.

I have written a UTF-8 encoder/decoder. Fetch these two files,
‘unicode.rb’ and ‘misc.rb’, from:
http://rubyforge.org/cgi-bin/viewcvs/cgi/viewcvs.cgi/projects/experimental/unicode/?cvsroot=aeditor

$ ruby a.rb
"f\345 r\370dgr\370d med fl\370deovertr\346k"
"f\303\245 r\303\270dgr\303\270d med fl\303\270deovertr\303\246k"
$ cat a.rb
require 'unicode'
text = "få rødgrød med flødeovertræk"
p text
p text.to_utf8

···

On Fri, 19 Dec 2003 16:59:53 +0100, Jean-Baptiste wrote:


Simon Strandgaard

I assume you mean UTF-8? There is also UTF-7, UTF-16, and UTF-32,
where the number refers to the size in bits of each code unit in the
result; in UTF-8 and UTF-16 a character can take more than one unit.

Use the iconv library, which comes with Ruby 1.8. You have to know
what encoding you’re starting with - ISO-8859-1 (a.k.a Latin-1) or
whatever your local character set is. Note the order of arguments
in Iconv.open: the target character set comes first, then the source
character set.

irb(main):001:0> require 'iconv'
=> true
irb(main):002:0> convertor = Iconv.open("UTF-8", "ISO-8859-1")
=> #<Iconv:0x401f5030>
irb(main):003:0> strUTF = convertor.iconv("test")
=> "test"

Which isn’t much of a test since the UTF-8 of “test” is just “test”.
This is a better demonstration:

    irb(main):004:0> convertor.iconv("¡Hola!")
    => "\302\241Hola!"

The two-byte sequence “\302\241” (that is, a byte whose value is
octal 302 = decimal 194 = hexadecimal C2, followed by a byte whose value is
octal 241 = decimal 161 = hexadecimal A1) is in fact the UTF-8 encoding
of the character U+00A1 INVERTED EXCLAMATION MARK.
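
The arithmetic behind that two-byte sequence can be checked by hand. This is only a minimal sketch, assuming Ruby 1.9 or later; the helper name `utf8_two_byte` is made up for illustration, and it deliberately handles only code points from U+0080 to U+07FF, which are the ones UTF-8 stores in two bytes.

```ruby
# Hypothetical helper: encode a code point in U+0080..U+07FF as two UTF-8 bytes.
# Byte layout: 110xxxxx 10yyyyyy, where x = the upper 5 bits, y = the lower 6 bits.
def utf8_two_byte(code_point)
  unless (0x80..0x7FF).cover?(code_point)
    raise ArgumentError, "only U+0080..U+07FF take two bytes"
  end
  lead  = 0xC0 | (code_point >> 6)    # 110xxxxx
  trail = 0x80 | (code_point & 0x3F)  # 10yyyyyy
  [lead, trail]
end

bytes = utf8_two_byte(0x00A1)  # U+00A1 INVERTED EXCLAMATION MARK
puts bytes.map { |b| format("%o", b) }.join(" ")    # octal, as irb shows it: 302 241
puts bytes.map { |b| format("%02X", b) }.join(" ")  # hexadecimal: C2 A1
```

For U+00A1 the upper bits are 00010 and the lower bits are 100001, which gives C2 and A1.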

-Mark

Hi,

···

At Sat, 20 Dec 2003 02:56:55 +0900, Mark J. Reed wrote:

    irb(main):004:0> convertor.iconv("¡Hola!")
    => "\302\241Hola!"

If the source is ISO-8859-1, iconv is not needed: each ISO-8859-1 byte
value is also the Unicode code point of its character.

$ ruby -e 'p "\xa1Hola!".unpack("C*").pack("U*")'
"\302\241Hola!"
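
On a current Ruby the same one-liner can be compared against the built-in `String#encode`, which is the usual route now that Iconv was dropped from the standard library in Ruby 2.0. A small sketch, assuming Ruby 2.0 or later (for `String#b`); the variable names are mine, not from the thread.

```ruby
# Bytes of "¡Hola!" in ISO-8859-1; "\xA1" is the inverted exclamation mark.
latin1 = "\xA1Hola!".b

# Read each byte as an integer, then write each integer out as a UTF-8 character.
via_pack = latin1.unpack("C*").pack("U*")

# The built-in route: label the bytes as ISO-8859-1, then transcode to UTF-8.
via_encode = latin1.dup.force_encoding("ISO-8859-1").encode("UTF-8")

p via_pack.bytes.first(2).map { |b| format("%o", b) }  # the two leading bytes, in octal
p via_pack == via_encode                               # both routes give the same string
```

The `unpack`/`pack` trick is specific to ISO-8859-1 input; for any other source encoding the `encode` route (or iconv) is the one to use.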


Nobu Nakada

Mark J. Reed wrote:

Use the iconv library, which comes with Ruby 1.8.

Is iconv available on Windows 1.8.0 PragProg distribution?

C:\TEMP\x>ruby -v
ruby 1.8.0 (2003-08-04) [i386-mswin32]

You have to know
what encoding you’re starting with - ISO-8859-1 (a.k.a Latin-1) or
whatever your local character set is. Note the order of arguments
in Iconv.open: the target character set comes first, then the source
character set.

irb(main):001:0> require 'iconv'
=> true

Not for me:

irb(main):001:0> require 'iconv'
LoadError: No such file to load -- iconv
        from (irb):1:in `require'
        from (irb):1
irb(main):002:0> VERSION
=> "1.8.0"

Trying your code in a file named ‘tryconv.rb’:

require 'iconv'
convertor = Iconv.open("UTF-8", "ISO-8859-1")
strUTF = convertor.iconv("test")

I get:

C:\TEMP\x>ruby tryconv.rb
./iconv.rb:2: uninitialized constant Iconv (NameError)
        from tryconv.rb:1:in `require'
        from tryconv.rb:1
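
When iconv cannot be loaded, as in the session above, a script can detect that and fall back to something else instead of stopping. A rough sketch, assuming the input is ISO-8859-1 and using the `unpack`/`pack` trick suggested earlier in the thread as the fallback; the method name `latin1_to_utf8` is hypothetical.

```ruby
# Hypothetical helper: convert ISO-8859-1 bytes to UTF-8, with or without iconv.
def latin1_to_utf8(str)
  begin
    require 'iconv'
  rescue LoadError
    # No iconv on this installation. Each ISO-8859-1 byte can be
    # re-emitted directly as the UTF-8 character with the same number.
    return str.unpack("C*").pack("U*")
  end
  # iconv is available: target encoding first, then source encoding.
  Iconv.conv("UTF-8", "ISO-8859-1", str)
end

p latin1_to_utf8("test")
```

The fallback only covers ISO-8859-1; for other source encodings there is no such shortcut and a working iconv (or another converter) is still required.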

Cheers, Giuliano

···


If you want to send me an email in the address you have to write ‘p’,
then a dot, followed by ‘bossi’ at ‘quinary’, another dot and ‘com’ at last

Piergiuliano Bossi wrote:

Is iconv available on Windows 1.8.0 PragProg distribution?

[…]

yes, i reported this recently. it’s a pretty bad problem for me :O/
i hope somebody smarter than me will fix it…

emmanuel

Emmanuel Touzery wrote:

Piergiuliano Bossi wrote:

Is iconv available on Windows 1.8.0 PragProg distribution?

[…]

yes, i reported this recently. it’s a pretty bad problem for me :O/
i hope somebody smarter than me will fix it…

You are right, I’m sorry, I should have checked first.

Not a big issue for me, anyway, I’m just curious to understand how it
can work on Windows.

Cheers, Giuliano


Piergiuliano Bossi wrote:

Not a big issue for me, anyway, I’m just curious to understand how it
can work on Windows.

I would like to point out that a port of this feature to Windows is
possible; there is a native Windows iconv port:

unfortunately I think right now ruby does not support iconv at all on win32.

emmanuel

Hi,

unfortunately I think right now ruby does not support iconv at all on win32.

It’s supported, if you have it installed and pass proper
options to configure.

···

At Mon, 22 Dec 2003 22:21:08 +0900, Emmanuel Touzery wrote:

set configure_args=--with-iconv-dir=where_you_installed_it
nmake


Nobu Nakada

great :O)
i hope ruby-pragprog 1.8.1 will have it by default…

emmanuel
