UTF-8 encoding

#1

Hi,

I'm working with xmlrpc/client, and the server I'm working against
requires UTF-8 for my strings.

I found this article with Google:
http://redhanded.hobix.com/inspect/futurismUnicodeInRuby.html

Is this for real? Is there no workaround for getting my data to the
server in UTF-8 encoding?

Best,
martin

(Dave Burt) #2

balony@gmail.com wrote:

> I'm working with xmlrpc/client, and the server I'm working against
> requires UTF-8 for my strings.
>
> I found this article with Google:
> http://redhanded.hobix.com/inspect/futurismUnicodeInRuby.html
>
> Is this for real? Is there no workaround for getting my data to the
> server in UTF-8 encoding?

You can use iconv, or you can pipe the data through an external GNU
recode process. Iconv.new takes the target encoding first, then the
source encoding:

utf8_string = Iconv.new('utf-8', 'iso-8859-1').iconv(iso_8859_string)
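
If the input really is ISO-8859-1, the same conversion can also be sketched in plain Ruby, with no Iconv or recode involved. This relies on every ISO-8859-1 byte value being equal to its Unicode code point; it is not valid for other 8-bit encodings such as Windows-1252. The helper name latin1_to_utf8 is made up for this illustration.

```ruby
# Hypothetical helper, not part of any library.
# Assumes the input bytes are ISO-8859-1 (Latin-1).
def latin1_to_utf8(latin1)
  # unpack("C*") yields each byte as an integer; for ISO-8859-1 that
  # integer is also the character's Unicode code point.
  # pack("U*") then writes each code point out as UTF-8.
  latin1.unpack("C*").pack("U*")
end

utf8_string = latin1_to_utf8("caf\xE9")  # 0xE9 is e-acute in ISO-8859-1
```

The result holds the bytes 63 61 66 C3 A9, i.e. the e-acute has become the two-byte UTF-8 sequence C3 A9.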

(Bertram Scharpf) #3

Hi,

On Friday, 12 Aug 2005, 03:11:11 +0900, balony@gmail.com wrote:

> Is this for real? Is there no workaround for getting my data to the
> server in UTF-8 encoding?

I convert UTF-16 to UTF-8 using the workaround below. A quick
look at, e.g., the Wikipedia description of UTF-8 should lead
you to a similar solution easily.

Bertram

------------------------------

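# Convert a UTF-16 string (with an optional byte-order mark) to UTF-8.
# Note: slice! removes the BOM from the caller's string, and surrogate
# pairs are not combined, so only BMP characters convert correctly.
def utf16to8 utf16
  # Pick the unpack directive from the byte-order mark, if there is one.
  endian = if utf16.slice! /\A\xff\xfe/ then
    "v"   # little-endian
  elsif utf16.slice! /\A\xfe\xff/ then
    "n"   # big-endian
  else
    "S"   # no BOM: assume the machine's native byte order
  end
  utf8 = ""
  # /m makes "." match newline bytes too, so "\n\0" is not skipped.
  utf16.scan /../m do |bb|
    b, = bb.unpack endian
    if b < 0x80 then
      # 1 byte: 0xxxxxxx
      utf8 << b
    elsif b < 0x800 then
      # 2 bytes: 110xxxxx 10xxxxxx
      utf8 << (0b11000000 | (b >> 6 & 0b11111))
      utf8 << (0b10000000 | (b & 0b111111))
    elsif b < 0x10000 then
      # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
      utf8 << (0b11100000 | (b >> 12 & 0b1111))
      utf8 << (0b10000000 | (b >> 6 & 0b111111))
      utf8 << (0b10000000 | (b & 0b111111))
    else
      # 4 bytes; never reached for a single 16-bit unit
      utf8 << (0b11110000 | (b >> 18 & 0b111))
      utf8 << (0b10000000 | (b >> 12 & 0b111111))
      utf8 << (0b10000000 | (b >> 6 & 0b111111))
      utf8 << (0b10000000 | (b & 0b111111))
    end
  end
  utf8
end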
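
The routine above treats every 16-bit unit as a complete character, so characters outside the Basic Multilingual Plane, which UTF-16 stores as a surrogate pair, are not combined into one code point. A minimal sketch of that missing step follows; the helper name utf16_units_to_codepoints is invented for this example, and it passes lone surrogates through unchanged rather than rejecting them.

```ruby
# Hypothetical helper: turn an array of UTF-16 code units into code points,
# joining each high/low surrogate pair into one supplementary code point.
def utf16_units_to_codepoints(units)
  codepoints = []
  i = 0
  while i < units.length
    unit = units[i]
    nxt = units[i + 1]
    if unit >= 0xD800 && unit <= 0xDBFF && nxt && nxt >= 0xDC00 && nxt <= 0xDFFF
      # High surrogate followed by low surrogate: 10 bits from each.
      codepoints << 0x10000 + ((unit - 0xD800) << 10) + (nxt - 0xDC00)
      i += 2
    else
      # Ordinary BMP code unit (or an unpaired surrogate, passed through).
      codepoints << unit
      i += 1
    end
  end
  codepoints
end

utf16le = "\x3D\xD8\x00\xDE"      # U+1F600 as little-endian UTF-16, no BOM
units = utf16le.unpack("v*")      # => [0xD83D, 0xDE00]
utf8 = utf16_units_to_codepoints(units).pack("U*")
```

Here pack("U*") does the UTF-8 bit packing that the hand-written branches perform, including the four-byte form.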

--
Bertram Scharpf
Stuttgart, Deutschland/Germany
http://www.bertram-scharpf.de

#4

Dave Burt wrote:

> balony@gmail.com wrote:
> >
> > I'm working with xmlrpc/client, and the server I'm working against
> > requires UTF-8 for my strings.
> >
> > I found this article with Google:
> > http://redhanded.hobix.com/inspect/futurismUnicodeInRuby.html
> >
> > Is this for real? Is there no workaround for getting my data to the
> > server in UTF-8 encoding?
>
> You can use iconv, or you can pipe the data through GNU recode.
>
> utf8_string = Iconv.new('utf-8', 'iso-8859-1').iconv(iso_8859_string)

I'm running on a Windows system; could that be a problem?
Also, I'm getting "test.rb:4: uninitialized constant Iconv (NameError)"
when testing.

--
Martin

(Jeff Mital) #5

* balony@gmail.com <balony@gmail.com>:

> Dave Burt wrote:
>
> [...]
>
> > > Is this for real? Is there no workaround for getting my data to the
> > > server in UTF-8 encoding?
> >
> > You can use iconv, or you can pipe the data through GNU recode.
> >
> > utf8_string = Iconv.new('utf-8', 'iso-8859-1').iconv(iso_8859_string)
>
> I'm running on a Windows system; could that be a problem? Also,
> I'm getting "test.rb:4: uninitialized constant Iconv (NameError)"
> when testing.

You'll need to put the line:

require 'iconv'

at the top of your program to use the Iconv.* functions. I don't
have access to a Windows box, but if it's part of Ruby's standard
library, I'd assume it's available on Windows.

    - jeff -

--
jeff mital
jmital@aracnet.com
Unix is simple. It just takes a genius to understand its simplicity.
        - Dennis Ritchie

(Park Heesob) #6

Hi,

----- Original Message -----
From: <balony@gmail.com>
Newsgroups: comp.lang.ruby
To: "ruby-talk ML" <ruby-talk@ruby-lang.org>
Sent: Friday, August 12, 2005 6:31 AM
Subject: Re: UTF-8 encoding

> Dave Burt wrote:
>
> > balony@gmail.com wrote:
> > >
> > > I'm working with xmlrpc/client, and the server I'm working against
> > > requires UTF-8 for my strings.
> > >
> > > I found this article with Google:
> > > http://redhanded.hobix.com/inspect/futurismUnicodeInRuby.html
> > >
> > > Is this for real? Is there no workaround for getting my data to the
> > > server in UTF-8 encoding?
> >
> > You can use iconv, or you can pipe the data through GNU recode.
> >
> > utf8_string = Iconv.new('utf-8', 'iso-8859-1').iconv(iso_8859_string)
>
> I'm running on a Windows system; could that be a problem?
> Also, I'm getting "test.rb:4: uninitialized constant Iconv (NameError)"
> when testing.
>
> --
> Martin

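Try this:

      def encode(str)
        begin
            # Windows: convert from the ANSI code page (code page 0) to
            # UTF-16 via the Win32 API, then repack the units as UTF-8.
            require 'Win32API'
            str += "\0"
            ostr = "\0" * 256   # output buffer: room for 128 wide characters
            multiByteToWideChar = Win32API.new('kernel32', 'MultiByteToWideChar',
                                               ['L','L','P','L','P','L'], 'L')
            n = multiByteToWideChar.Call(0,0,str,-1,ostr,128)
            # n is the number of wide characters written, including the
            # terminating NUL; 0 means failure (e.g. the buffer is too small).
            raise ArgumentError, "MultiByteToWideChar failed" if n == 0
            ostr[0, (n - 1) * 2].unpack("S*").pack("U*")
        rescue LoadError
            # Elsewhere: fall back to iconv (target encoding comes first).
            require 'iconv'
            Iconv::iconv('UTF-8','iso-8859-1',str)[0]
        end
      end

utf8_string = encode(iso_8859_string)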
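
Before handing the result to the server, a rough sanity check that the bytes decode as UTF-8 can be sketched as below. It assumes that unpack("U*") raises ArgumentError on malformed input; it is not a full validator and may accept some sequences a strict UTF-8 decoder would reject. The method name valid_utf8? is made up for this example.

```ruby
# Hypothetical helper: true if the bytes of str can be decoded as UTF-8.
def valid_utf8?(str)
  str.unpack("U*")   # decoding raises ArgumentError on malformed bytes
  true
rescue ArgumentError
  false
end

valid_utf8?("caf\xC3\xA9")   # e-acute as a two-byte UTF-8 sequence
valid_utf8?("caf\xE9")       # raw ISO-8859-1 byte, not UTF-8
```

The first call should report true and the second false, which is a quick way to spot a string that was never converted.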

Regards,

Park Heesob

#7

> > > utf8_string = Iconv.new('utf-8', 'iso-8859-1').iconv(iso_8859_string)
> >
> > I'm running on a Windows system; could that be a problem? Also,
> > I'm getting "test.rb:4: uninitialized constant Iconv (NameError)"
> > when testing.
>
> You'll need to put the line:
>
> require 'iconv'
>
> at the top of your program to use the Iconv.* functions. I don't
> have access to a Windows box, but if it's part of Ruby's standard
> library, I'd assume it's available on Windows.

The One-Click Installer still doesn't include Iconv (the developers are
tracking it as an open issue).

The Rails wiki has some info on how to get it going:


(See the Windows section.)
To summarize, the Ruby-mswin32 distribution at
http://www.garbagecollect.jp/ruby/mswin32/en/download/release.html
contains the required files.
You can also get the required iconv.dll from gettext on SourceForge:
http://sourceforge.net/project/showfiles.php?group_id=25167&package_id=51458

Cheers,
Dave