I have some data in a file encoded in Windows-1252 (it contains "special"
characters, for example accented words). I use the encode method to
convert the data to UTF-8 before storing them in a SQLite3 DB:
mydata.encode("utf-8")
Using SQLiteSpy I can see the data with the right characters.
But when I read the data back from the DB with my program, I want to
process them in Windows-1252 again. If I call encode with windows-1252,
I get an error:
mydata.encode("windows-1252")
compare_synonyms.rb:21:in `encode': "\xC3" from ASCII-8BIT to UTF-8
in conversion from ASCII-8BIT to Windows-1252
(Encoding::UndefinedConversionError) from compare_synonyms.rb:21:in
`block (2 levels) in identify_synonyms'
Now, if I use codepoints, the data are not displayed with the
right characters:
Although the data from the DB are encoded in UTF-8, Ruby doesn't know
that (the string is tagged ASCII-8BIT, as the error message shows), so
you must tell Ruby the encoding before converting to Windows-1252.
# tell ruby the encoding
mydata.force_encoding( "UTF-8" )
# encode to windows-1252
mydata.encode( "windows-1252" )
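A minimal sketch of the failure and the fix, for anyone who wants to try it without a database. The literal "caf\xC3\xA9".b is only a stand-in (an assumption) for a value returned by the SQLite driver: the UTF-8 bytes of "café", tagged as ASCII-8BIT.

```ruby
# Stand-in for a value read from the DB (assumption for illustration):
# the UTF-8 bytes of "café", tagged as binary (ASCII-8BIT).
mydata = "caf\xC3\xA9".b

# Ruby does not know the source encoding, so the conversion fails
# in the same way as in the original question.
begin
  mydata.encode("windows-1252")
rescue Encoding::UndefinedConversionError => e
  puts "conversion failed: #{e.class}"
end

# Relabel the bytes as UTF-8 (the bytes themselves do not change),
# then transcode to Windows-1252.
mydata.force_encoding("UTF-8")
converted = mydata.encode("windows-1252")

p converted.encoding.name  # "Windows-1252"
p converted.bytes          # [99, 97, 102, 233]
```

Note that force_encoding modifies the receiver in place, while encode returns a new string.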
Regards,
Hey, thanks a lot!! Now I can see the right characters =D
In message "Re: From UTF-8 to windows-1252" on Fri, 7 Jan 2011 03:53:26 +0900, "Y. NOBUOKA" <nobuoka@r-definition.com> writes:
>Although the data from the DB are encoded in UTF-8, Ruby doesn't know
>that, so you must tell Ruby the encoding before converting to
>Windows-1252.
>
> # tell ruby the encoding
> mydata.force_encoding( "UTF-8" )
> # encode to windows-1252
> mydata.encode( "windows-1252" )
For the record, you don't have to use force_encoding:
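The code that followed this remark is not preserved here. One way to do it, offered as a sketch and not necessarily what the poster wrote, is to pass the source encoding as the second argument to String#encode. As above, "caf\xC3\xA9".b is only a stand-in for a value read from the DB.

```ruby
# Stand-in for a value read from the DB (assumption for illustration):
# the UTF-8 bytes of "café", tagged as binary (ASCII-8BIT).
mydata = "caf\xC3\xA9".b

# encode(destination, source): treat the bytes as UTF-8 and transcode
# them to Windows-1252 without touching the receiver's encoding tag.
converted = mydata.encode("windows-1252", "utf-8")

p mydata.encoding == Encoding::ASCII_8BIT  # true, receiver is unchanged
p converted.bytes                          # [99, 97, 102, 233]
```

Unlike force_encoding, this leaves mydata itself untouched, which may matter if the same string is used elsewhere.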