I don't know how to write a file using UTF-8 encoding. Can you help
me?
Thanks in advance
Kind regards
--
Miquel (a.k.a. Ton)
Linux User #286784
GPG Key : 4D91EF7F
Debian GNU/Linux (Linux Wolverine 2.6.14)
Welcome to the jungle, we got fun and games
Guns n' Roses
> I don't know how to write a file using UTF-8 encoding. Can you help
> me?
UTF-8 text is stored as ordinary 8-bit bytes. If your data is already
UTF-8, save it like this:
# "data" contains the text data
File.open(file_path,"w") { |f| f.write data }
UTF-8 is a convention for how the bytes are laid out and how they
are interpreted when read. It isn't something you can specify in a
plain-text file. It can be inferred from the byte patterns, but that
inference is a guess rather than a guarantee.
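For readers on Ruby 1.9 or later (an assumption; this thread predates it), the encoding can be named in the open mode, and the byte-pattern inference described above can be tried with String#valid_encoding?. A minimal sketch, where the file name and sample text are made up for illustration:

```ruby
# Assumes Ruby 1.9+; the path and text below are illustrative only.
require 'tmpdir'

text = "caf\u00e9"   # "café"; a literal with \u escapes is a UTF-8 string
path = File.join(Dir.tmpdir, "utf8_example.txt")

# "w:UTF-8" sets the file's external encoding, so Ruby transcodes on write if needed.
File.open(path, "w:UTF-8") { |f| f.write(text) }

# Read the raw bytes back and check whether they form a valid UTF-8 sequence.
raw = File.binread(path)
looks_like_utf8 = raw.dup.force_encoding("UTF-8").valid_encoding?
```

A true result only says the bytes are well-formed UTF-8. Plain ASCII also passes, so this is a plausibility check, not proof of what the author intended.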
Well, how are you storing the Unicode characters you are using
internally? If your Unicode string within Ruby is stored as an array
of ints (code points), then
File.open("output_file.utf8", "w") do |fp|
  fp.puts(data.pack("U*"))
end
should be sufficient. If you have a Ruby string that uses some other
encoding (e.g. ISO-8859-1), then you must use the iconv library to
convert the string to UTF-8:
require 'iconv'
cd = Iconv.new('utf-8', 'iso-8859-1')
File.open("output_file.utf8", "w") do |fp|
  fp.puts(cd.iconv(data))
end
When you do i18n, l10n, and m17n, strings become meaningless unless
they have an attached encoding.
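On Ruby 1.9 and later, every String does carry an attached encoding, and iconv was removed from the standard library in Ruby 2.0. A rough sketch of both conversions above under that assumption, using String#encode in place of Iconv; the sample data is hypothetical:

```ruby
# Assumes Ruby 1.9+; sample values are illustrative only.

# Case 1: code points held as an array of ints -> UTF-8 string.
codepoints = [0x63, 0x61, 0x66, 0xE9]            # c, a, f, é
utf8_from_ints = codepoints.pack("U*")           # pack("U*") yields a UTF-8 string

# Case 2: bytes in another encoding (here ISO-8859-1) -> UTF-8 string.
latin1 = [0x63, 0x61, 0x66, 0xE9].pack("C*").force_encoding("ISO-8859-1")
utf8_from_latin1 = latin1.encode("UTF-8")        # transcodes 0xE9 to 0xC3 0xA9

puts utf8_from_ints.bytes.inspect
puts utf8_from_latin1.encoding
```

force_encoding only relabels the bytes, while encode rewrites them. Mixing the two up is a common source of mangled output.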
On 11/12/06, Miquel Oliete <ktalanet@yahoo.es> wrote:
> Hi All
> I have a problem (newbie problem).
> I don't know how to write a file using UTF-8 encoding. Can you help
> me?
On 11/12/06, David Vallner <david@vallner.net> wrote:
Paul Lutus wrote:
> It isn't something you can specify in a
> plain-text file.
Byte order mark?
Not meaningful in UTF-8, since it's all a defined series of bytes
(it's always the same order on all platforms).
-austin
Yes, but it can be used as a "this file is UTF-8" marker by convention.
And cause problems in software that doesn't recognize the convention,
for added hilarity.
It's a bad convention, because it adds meaningless bytes to the
beginning of a file. I'm not saying that an unadorned document is
better, but better to do something that has actual meaning than doing
a pointless BOM.
-austin
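For software that has to cope with files carrying the marker discussed here, the UTF-8 BOM is the three bytes EF BB BF at the start of the file. A small sketch of dropping it, assuming Ruby 1.9.3 or later; the helper name strip_utf8_bom is hypothetical, not part of any library:

```ruby
# Assumes Ruby 1.9.3+; strip_utf8_bom is a made-up helper name.
UTF8_BOM = [0xEF, 0xBB, 0xBF].pack("C*")   # the BOM as a binary string

def strip_utf8_bom(bytes)
  # Work on a binary copy so the comparison is byte-for-byte.
  data = bytes.dup.force_encoding(Encoding::BINARY)
  data = data.byteslice(3..-1) if data.start_with?(UTF8_BOM)
  # Label the remaining bytes as UTF-8; this does not validate them.
  data.force_encoding(Encoding::UTF_8)
end

puts strip_utf8_bom(UTF8_BOM + "hi".b)
```

When reading from a file, Ruby's "r:BOM|UTF-8" open mode should give the same effect without a helper.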
On 11/12/06, David Vallner <david@vallner.net> wrote:
Austin Ziegler wrote:
> On 11/12/06, David Vallner <david@vallner.net> wrote:
>> Paul Lutus wrote:
>> > It isn't something you can specify in a
>> > plain-text file.
>> Byte order mark?
> Not meaningful in UTF-8, since it's all a defined series of bytes
> (it's always the same order on all platforms).
Yes, but it can be used as a "this file is UTF-8" marker by convention.
And cause problems in software that doesn't recognize the convention,
for added hilarity.