From UTF-8 to windows-1252

Hello.

I have some data in a file encoded in windows-1252 (it contains
"special" characters, for example accented letters). I use the encode
method to convert the data to UTF-8 before storing it in a SQLite3 DB:

   mydata.encode("utf-8")

Using SQLiteSpy I can see the data with the right characters.

But when I read the data back from the DB in my program, I want to
process it in Windows-1252 again. If I call encode with windows-1252, I
get an error:

   mydata.encode("windows-1252")

   compare_synonyms.rb:21:in `encode': "\xC3" from ASCII-8BIT to UTF-8 in conversion from ASCII-8BIT to Windows-1252 (Encoding::UndefinedConversionError)
       from compare_synonyms.rb:21:in `block (2 levels) in identify_synonyms'

If I use codepoints instead, the data is not displayed with the right
characters:

   mydata.codepoints.to_a.pack("C*")
   >> acompañar

What is happening? What can I do?

Thanks in advance.

--
Posted via http://www.ruby-forum.com/.

Hello,

> But when I read the data back from the DB in my program, I want to
> process it in Windows-1252 again. If I call encode with windows-1252, I
> get an error:
>
>    mydata.encode("windows-1252")

Although the data from the DB is encoded in UTF-8, Ruby doesn't know
that (the string is tagged as ASCII-8BIT), so you must tell Ruby the
encoding before converting to Windows-1252.

  # tell ruby the encoding
  mydata.force_encoding( "UTF-8" )
  # encode to windows-1252
  mydata.encode( "windows-1252" )
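To illustrate the point above, here is a minimal sketch. It assumes the string coming back from the DB holds UTF-8 bytes but is tagged ASCII-8BIT, as the error message suggests; the variable names and the sample word are only for illustration, and String#b needs Ruby 2.0 or later.

```ruby
# Simulated DB value: the UTF-8 bytes of "acompañar", tagged as binary.
raw = "acompa\xC3\xB1ar".b

# Transcoding straight from ASCII-8BIT fails, as in the original post.
error = begin
  raw.encode("windows-1252")
  nil
rescue Encoding::UndefinedConversionError => e
  e
end

# force_encoding only relabels the bytes (it changes its receiver in place,
# hence the dup); encode then returns a new, transcoded string.
utf8   = raw.dup.force_encoding("UTF-8")
cp1252 = utf8.encode("windows-1252")

# On a binary string, codepoints are just the bytes, so repacking them
# with pack("C*") gives back the same untranslated UTF-8 bytes.
repacked = raw.codepoints.to_a.pack("C*")

# Reading those UTF-8 bytes as windows-1252 is what produces "Ã±".
mojibake = raw.dup.force_encoding("windows-1252").encode("UTF-8")
```

Note that encode does not modify mydata; keep its return value if you need the converted string.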

Regards,

--
nobuoka

> Although the data from the DB is encoded in UTF-8, Ruby doesn't know
> that (the string is tagged as ASCII-8BIT), so you must tell Ruby the
> encoding before converting to Windows-1252.
>
>   # tell ruby the encoding
>   mydata.force_encoding( "UTF-8" )
>   # encode to windows-1252
>   mydata.encode( "windows-1252" )
>
> Regards,

Hey, thanks a lot!! Now I can see the right characters =D

Regards.

--
Posted via http://www.ruby-forum.com/.

Hi,

In message "Re: From UTF-8 to windows-1252" on Fri, 7 Jan 2011 03:53:26 +0900, "Y. NOBUOKA" <nobuoka@r-definition.com> writes:

> Although the data from the DB is encoded in UTF-8, Ruby doesn't know
> that (the string is tagged as ASCII-8BIT), so you must tell Ruby the
> encoding before converting to Windows-1252.
>
>   # tell ruby the encoding
>   mydata.force_encoding( "UTF-8" )
>   # encode to windows-1252
>   mydata.encode( "windows-1252" )

For the record, you don't have to use force_encoding:

  mydata.encode("windows-1252", "UTF-8")
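As a rough check that the two forms agree, here is a small sketch. The sample string is an assumption standing in for a value read from the DB (UTF-8 bytes tagged as binary), and String#b needs Ruby 2.0 or later.

```ruby
# Stand-in for a DB value: UTF-8 bytes of "acompañar", tagged ASCII-8BIT.
raw = "acompa\xC3\xB1ar".b

# Second argument names the source encoding, so no relabelling is needed.
one_step = raw.encode("windows-1252", "UTF-8")

# The force_encoding route, done on a copy so raw stays untouched.
two_step = raw.dup.force_encoding("UTF-8").encode("windows-1252")

same_result   = (one_step == two_step)
raw_unchanged = (raw.encoding == Encoding::BINARY)
```

The two-argument form also leaves the original string's encoding tag alone, which can matter if other code still expects the binary string.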

              matz.

Hi, matz

> Although the data from the DB is encoded in UTF-8, Ruby doesn't know
> that (the string is tagged as ASCII-8BIT), so you must tell Ruby the
> encoding before converting to Windows-1252.
>
> # tell ruby the encoding
> mydata.force_encoding( "UTF-8" )
> # encode to windows-1252
> mydata.encode( "windows-1252" )

> For the record, you don't have to use force_encoding:
>
>   mydata.encode("windows-1252", "UTF-8")

I had overlooked the +src_encoding+ and +options+ arguments.
Now I see that String#encode is a very useful method.
http://www.ruby-doc.org/core/classes/String.html#M001113
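For example, the options hash can keep a conversion from raising when a character has no windows-1252 equivalent. This is only a sketch: the sample text and the "?" replacement are my own choices, not something from the thread.

```ruby
# "niño 日": the last character does not exist in windows-1252.
text = "ni\u00F1o \u65E5"

# Without options, the unconvertible character raises.
strict_failed = begin
  text.encode("windows-1252", "UTF-8")
  false
rescue Encoding::UndefinedConversionError
  true
end

# undef: :replace substitutes characters missing from the target encoding;
# replace: sets the substitute string.
lenient = text.encode("windows-1252", "UTF-8", undef: :replace, replace: "?")
```

There is also invalid: :replace, which handles byte sequences that are malformed in the source encoding.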

thanks!

--
nobuoka