How to send utf8 data to remote computer in ruby 1.9.2

Alexey_Petrushin · 17 August 2011 14:13

Hi, could you please point to the exact problem you mean related to Ruby 1.9 encoding?

1. There is no way to set a default source encoding (supplying a
command-line option is a very inconvenient way to do it).

2. So you have to put this extra line in every source file: # encoding: utf-8
I even came across an article where someone created a rake task
that reads all the files in lib and prepends this line, and he called it
'progress'. Well, if this is the measure of "progress", then
Java with its bloated code generation tools should be scored as far more
"progressive" than Ruby.
In my view it's regress, not progress.

3. There are problems with saving utf-8 characters in YAML.

4. Why should I have to bother about encoding at all these days? The
only possible justification may be performance improvements, but
Ruby is slow anyway, with or without any encoding optimization.
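
For reference, the `# encoding: utf-8` line discussed above can be tried out with a short script. This is only a sketch: it assumes the file is saved as UTF-8, and `greeting` and `restored` are illustrative names. On Ruby 2.0 and later UTF-8 is already the default source encoding, so the comment matters mainly on 1.9. YAML output differs between YAML engines and Ruby versions, so the round trip at the end is something to verify on your own interpreter rather than a guarantee.

```ruby
# encoding: utf-8
require "yaml"

# String literals take the source encoding declared by the magic comment.
greeting = "Привет"

puts __ENCODING__        # source encoding Ruby used for this file
puts greeting.encoding   # encoding tag carried by the literal
puts greeting.length     # 6 characters
puts greeting.bytesize   # 12 bytes, two per Cyrillic letter in UTF-8

# Quick YAML round-trip check for the same string.
restored = YAML.load(YAML.dump(greeting))
puts restored == greeting
```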

According to TIOBE, the Java user base, installed base and community
are about 30 times bigger than Ruby's, and there is only one string
encoding available there, Unicode (strictly UTF-16 internally, not utf-8),
and it's absolutely enough to cover all the possible use cases.

In 99.9% of cases you use UTF and you are happy! So why does Ruby, which
is positioned as a simple and beautiful language, have such a messy
encoding situation compared to "bloated Java"?

--
Posted via http://www.ruby-forum.com/.

11142 · 17 August 2011 15:41

I think the reason why Ruby has all these encodings, instead of simply
two: "UTF-8 string" and "binary data", is that there are some problems
(data losses or something) when converting some obscure Japanese
encodings to UTF and back. And since Ruby has strong Japanese
connections, developers cared about this enough to create the
situation we have now.

(Disclaimer: I may be wrong, I just read that somewhere.)

-- Matma Rex
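
The distinction between "string in some encoding" and "binary data", and the conversion concern mentioned above, can be sketched as follows. This is an illustration under assumptions, not a reproduction of the specific lossy cases: the three kanji used here convert cleanly, and the variable names are made up for the example. It shows that `force_encoding` only retags bytes, that `encode` actually transcodes them, and that a character with no mapping in the target encoding raises an error.

```ruby
# "\u" escapes always produce a UTF-8 string, whatever the source encoding is.
text = "\u65E5\u672C\u8A9E"   # three kanji

# force_encoding only changes the tag; the bytes stay untouched.
raw = text.dup.force_encoding(Encoding::BINARY)
puts raw.bytes == text.bytes      # true
puts text.length                  # 3
puts raw.length                   # 9 (every byte counts as one "character")

# encode really transcodes: new bytes, new tag.
sjis = text.encode(Encoding::Shift_JIS)
puts sjis.bytesize                # 6
puts sjis.encode(Encoding::UTF_8) == text  # true

# A character with no Shift_JIS mapping cannot be converted at all.
begin
  "\u{1F600}".encode(Encoding::Shift_JIS)
  unmappable = false
rescue Encoding::UndefinedConversionError
  unmappable = true
end
puts unmappable
```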


Brabuhr · 17 August 2011 20:02

In 99.9% of cases you use UTF and you are happy! So why does Ruby, which
is positioned as a simple and beautiful language, have such a messy
encoding situation compared to "bloated Java"?

Another good read on the subject:

"Ruby multilingualization (M17N) of Ruby 1.9 uses the code set
independent model (CSI) while many other languages use the Unicode
normalization model."

"Under the CSI model, all encodings are handled equally, which means,
Unicode is one of character sets. The most remarkable feature of the
CSI model is that the model does not require a character code
conversion since external and internal character codes are identical.
Thus, the cost for conversion can be eliminated. Besides, we can keep
away from unexpected information loss caused by the conversion,
especially by cutting bits or bytes off. Ruby uses the CSI model, so
do Solaris, Citrus, or other system based on the C library that does
not use __STDC_ISO_10646__."

"Moreover, it is possible to handle various character sets even though
they are not based on Unicode."
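
A small sketch of what "external and internal character codes are identical" looks like in practice, assuming `Encoding.default_internal` is left at its default of nil (the file name prefix and variable names are made up for the example). Reading with only an external encoding keeps the bytes exactly as stored, while adding an internal encoding after the colon asks Ruby to transcode on read.

```ruby
require "tempfile"

utf8_text  = "\u65E5\u672C\u8A9E"                  # three kanji, UTF-8
sjis_bytes = utf8_text.encode(Encoding::Shift_JIS) # same text as Shift_JIS

as_is = converted = nil

Tempfile.create("csi-demo") do |file|
  file.binmode            # write the bytes exactly as they are
  file.write(sjis_bytes)
  file.flush

  # CSI style: tag the data as Shift_JIS and leave the bytes alone.
  as_is = File.read(file.path, encoding: "Shift_JIS")

  # Conversion style: external Shift_JIS, transcoded to UTF-8 on read.
  converted = File.read(file.path, encoding: "Shift_JIS:UTF-8")
end

puts as_is.encoding
puts as_is.bytes == sjis_bytes.bytes   # true, nothing was converted
puts converted.encoding
puts converted == utf8_text            # true
```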
