How to send UTF-8 data to a remote computer in Ruby 1.9.2

Hi, could you please point to the exact problem you mean related to Ruby 1.9 encoding?

1. There is no way to set the default source encoding (supplying a
command-line option is a very inconvenient way to do it).

2. So you have to put this extra line in every source file: # encoding: utf-8
I even came across an article where someone created a rake task
that reads all the files in lib and prepends this line, and he called it
'progress'. Well, if that is the measure of "progress", then
Java with its bloated code generation tools should be scored as far more
"progressive" than Ruby.
In my view it's regress, not progress.

3. There are problems with saving UTF-8 characters in YAML.

4. Why should I have to bother about encoding at all these days? The
only possible justification may be performance improvements, but
Ruby is slow anyway, with or without any encoding optimization.

According to TIOBE, the Java user base, installation base and community
are about 30 times bigger than Ruby's, and there is only one available
encoding, UTF-8, and it's absolutely enough to cover all possible
use cases.

In 99.9% of cases you use UTF and you are happy! So why does Ruby, which is
positioned as a simple and beautiful language, have such a messy situation
with encoding, compared to "bloated Java"?


--
Posted via http://www.ruby-forum.com/.
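
For readers who have not seen the "extra line" from the post above, here is a minimal sketch of a file that starts with the magic comment. The describe helper is a hypothetical name used only for illustration. On Ruby 1.9 a literal containing raw non-ASCII characters is rejected unless the file declares its encoding this way; Ruby 2.0 and later assume UTF-8 source by default. The sketch uses \u escapes, which produce a UTF-8 string whatever the source encoding is, so it should run the same way with or without the comment.

```ruby
# encoding: utf-8
# The line above is the magic comment. It only takes effect as the first
# line of a file (or the second, after a shebang line).

# Hypothetical helper: summarize how Ruby sees a string.
def describe(str)
  "#{str.encoding.name}: #{str.length} chars, #{str.bytesize} bytes"
end

# \u escapes always yield a UTF-8 string, independent of the source encoding.
word = "r\u00E9sum\u00E9"

puts __ENCODING__    # the source encoding Ruby assumed for this file
puts describe(word)  # character count and byte count differ for non-ASCII text
```

The gap between the character count and the byte count is the reason Ruby has to know the encoding at all: without it, the two accented letters would be counted as four separate bytes.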

I think the reason why Ruby has all these encodings, instead of simply
two, "UTF-8 string" and "binary data", is that there are some problems
(data loss or something) when converting some obscure Japanese
encodings to Unicode and back. And since Ruby has strong Japanese
connections, the developers cared about this enough to create the
situation we have now.

(Disclaimer: I may be wrong, I just read that somewhere.)

-- Matma Rex
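
A rough sketch of the two views mentioned above ("UTF-8 string" versus "binary data") and of a conversion that cannot succeed. It does not reproduce the Japanese round-trip problem itself; it only shows, with Latin-1 as a stand-in, that transcoding between encodings is not always possible. The round_trips? helper is a hypothetical name, not part of Ruby.

```ruby
text = "caf\u00E9"                                # UTF-8 text
raw  = text.dup.force_encoding(Encoding::BINARY)  # same bytes, viewed as binary data

# force_encoding only relabels the bytes; it never changes them.
puts "#{text.length} chars as text, #{raw.length} bytes as binary"

# Hypothetical helper: does str survive a trip through another encoding?
def round_trips?(str, via)
  str.encode(via).encode(str.encoding) == str
rescue Encoding::UndefinedConversionError
  # Raised when a character has no counterpart in the target encoding.
  false
end

puts round_trips?(text, "ISO-8859-1")      # e-acute exists in Latin-1
puts round_trips?("\u20AC", "ISO-8859-1")  # the euro sign does not
```

Note the difference between the two calls: force_encoding changes only the label on the bytes, while encode actually rewrites the bytes and can therefore fail.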


2011/8/17 Alexey Petrushin <axyd80@gmail.com>:


In 99.9% of cases you use UTF and you are happy! So why does Ruby, which is
positioned as a simple and beautiful language, have such a messy situation
with encoding, compared to "bloated Java"?

Another good read on the subject:

"Ruby multilingualization (M17N) of Ruby 1.9 uses the code set
independent model (CSI) while many other languages use the Unicode
normalization model."

"Under the CSI model, all encodings are handled equally, which means,
Unicode is one of character sets. The most remarkable feature of the
CSI model is that the model does not require a character code
conversion since external and internal character codes are identical.
Thus, the cost for conversion can be eliminated. Besides, we can keep
away from unexpected information loss caused by the conversion,
especially by cutting bits or bytes off. Ruby uses the CSI model, so
do Solaris, Citrus, or other system based on the C library that does
not use __STDC_ISO_10646__."

"Moreover, it is possible to handle various character sets even though
they are not based on Unicode."
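
To make the quoted description a bit more concrete, here is a small sketch of what the CSI model looks like from Ruby code: every string carries its own encoding, and nothing is converted unless you ask for it. The same_text? and joinable? helpers are hypothetical names made up for this example.

```ruby
# Two kanji, written with \u escapes so the literal is UTF-8.
japan = "\u65E5\u672C"

# The same text held in three encodings; each string remembers its own.
variants = ["UTF-8", "Shift_JIS", "EUC-JP"].map { |name| japan.encode(name) }

variants.each do |s|
  puts "#{s.encoding.name}: #{s.length} chars, #{s.bytesize} bytes"
end

# Hypothetical helper: compare the text, not the bytes, via explicit conversion.
def same_text?(a, b)
  a.encode(Encoding::UTF_8) == b.encode(Encoding::UTF_8)
end

# Hypothetical helper: can the two strings be concatenated as they are?
def joinable?(a, b)
  a + b
  true
rescue Encoding::CompatibilityError
  # Ruby refuses to mix non-ASCII strings in different encodings implicitly.
  false
end

sjis, eucjp = variants[1], variants[2]
puts same_text?(sjis, eucjp)  # same characters once both are transcoded
puts joinable?(sjis, eucjp)   # but no implicit conversion happens on +
```

This is the trade-off the thread is arguing about: no hidden conversion cost or loss, at the price of the programmer having to convert explicitly when encodings meet.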