Short question about encoding

Hello everybody at the Ruby Forums!

After getting more into Ruby I stumbled across a little problem: how
would I go about making a string with a "square root" sign (√) inside
it, so that it gets encoded in UTF-8 and so that I can later save it in
a file?

Also, I read that Ruby does have encoding-related issues, so, if that's
the case, I can probably live without the "square root" sign.

puts "√".encode("UTF-8") # converts it into a "v"
puts "√" # -//-
puts "√".force_encoding("UTF-8") # -//-

Any ideas, please?

···

--
Posted via http://www.ruby-forum.com/.

One possibility is using the hex codes for that code point, like:

"\xE2\x88\x9A"

There is also a unicode escape, \u. For more information, take a look at:

  http://ruby.runpaint.org/strings#escapes-summary
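
A minimal sketch comparing the two escape forms, assuming Ruby 1.9 or later. The variable names are only illustrative, and the expected byte values are the ones irb reports for this character elsewhere in the thread.

```ruby
# Two ways to build the square root sign without typing it literally.

# Raw UTF-8 bytes; .dup avoids mutating a literal, force_encoding tags the bytes as UTF-8.
from_bytes = "\xE2\x88\x9A".dup.force_encoding("UTF-8")

# Unicode code point escape; a literal containing \u is always a UTF-8 string.
from_escape = "\u221A"

p from_escape.encoding        # encoding of the \u literal
p from_escape.bytes.to_a      # the three UTF-8 bytes of U+221A
p from_bytes == from_escape   # same bytes, same encoding
```

Either form sidesteps whatever the editor or console does to a literal "√" typed into the source file.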

HTH,
Ammar

···

On Thu, Nov 11, 2010 at 12:01 AM, Gabriel Lichard <poopsmith@lavabit.com> wrote:

Hello everybody at the Ruby Forums!

After getting more into Ruby I stumbled across a little problem: how
would I go about making a string with a "square root" sign (√) inside
it, so that it gets encoded in UTF-8 and so that I can later save it in
a file?

Also, I read that Ruby does have encoding-related issues, so, if that's
the case, I can probably live without the "square root" sign.

puts "√".encode("UTF-8") # converts it into a "v"
puts "√" # -//-
puts "√".force_encoding("UTF-8") # -//-

Any ideas, please?

Are you perhaps sending the output to a device that does not understand UTF-8?

I'm guessing that's the issue.

That said, I'm on Windows 7, and in the Command Prompt I can copy/paste
the "√" character and it appears fine in the Courier New font. But once
I run Ruby (Ruby 1.9.2p0) or irb, I can't copy/paste that character
anymore. When I save this in a file and run it:

s = "√"
p s
puts s

it outputs:

"\u221A"
√

But then again when I just do:

File.open("test.txt", "w"){|x|x << "√"}

and run it, it makes the test.txt file and saves it without any problems
and with the actual square root character in the file.
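
For comparison, a small sketch of the same write with the file encoding spelled out, then a check of the bytes that actually landed on disk. The temp-directory path and the variable names are assumptions made for illustration; the console is not involved in this step at all.

```ruby
require "tmpdir"

sqrt = "\u221A"  # the square root sign, written as an escape

# Hypothetical path in the temp directory so the sketch does not clobber real files.
path = File.join(Dir.tmpdir, "sqrt_test.txt")

# "w:UTF-8" names the file's external encoding instead of relying on the platform default.
File.open(path, "w:UTF-8") { |f| f << sqrt }

# Read the file back as raw bytes to see what was stored.
raw = File.binread(path)
p raw.bytes.to_a
```

If the bytes are the UTF-8 sequence for U+221A, the string and the file are fine and only the display step is at fault.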

Any idea what I'm doing wrong or why it won't appear in the console?

···

--
Posted via http://www.ruby-forum.com/.

I tried the chcp thing too :confused: :

system "chcp 65001"
s = "√"
p s
puts s

outputs:

Active code page: 65001
"√"
√

I'm still doing something wrong :confused: Any more ideas?

···

--
Posted via http://www.ruby-forum.com/.

Yup I'm on Windows 7.

About PowerShell: I tried the classic PowerShell.exe, which printed out
the same weird characters, so I tried "type .\test.rb" and that gave an
error. I Googled about PowerShell and Unicode and read that the
PowerShell ISE supports Unicode, so I tried that one: "type .\test.rb"
works fine and displays the character well, but "ruby .\test.rb"
(1.9.2p0) prints the weird characters again.

Any more suggestions, please?

Powershell ISE log:

···

_________________________________________________________________
PS C:\Users\Ye Olde Poopsmith\Desktop> ruby -v
ruby 1.9.2p0 (2010-08-18) [i386-mingw32]

_________________________________________________________________
PS C:\Users\Ye Olde Poopsmith\Desktop> ruby .\test.rb
"\u221A"
#<Encoding:UTF-8>
√

_________________________________________________________________
PS C:\Users\Ye Olde Poopsmith\Desktop> type .\test.rb
#system "chcp 65001"
s = "√"
p s
p s.encoding
puts s.to_s

--
Posted via http://www.ruby-forum.com/.

It seems to be fine with Ruby 1.9.2-p0.

sqrt = "√"

=> "√"

sqrt.encoding

=> #<Encoding:UTF-8>

sqrt.bytes

=> #<Enumerator: "√":bytes>

sqrt.bytes.to_a

=> [226, 136, 154]

sqrt.chars.to_a

=> ["√"]

puts sqrt


=> nil

puts sqrt.encode("US-ASCII")

Encoding::UndefinedConversionError: U+221A from UTF-8 to US-ASCII
  from (irb):10:in `encode'
  from (irb):10
  from /Users/rab/.rvm/rubies/ruby-1.9.2-p0/bin/irb:17:in `<main>'

puts sqrt.force_encoding("US-ASCII")


=> nil

sqrt.force_encoding("US-ASCII").chars.to_a

=> ["\xE2", "\x88", "\x9A"]

Are you perhaps sending the output to a device that does not understand UTF-8?

-Rob

Rob Biedenharn
Rob@AgileConsultingLLC.com http://AgileConsultingLLC.com/
rab@GaslightSoftware.com http://GaslightSoftware.com/

···

On Nov 10, 2010, at 5:34 PM, Ammar Ali wrote:

On Thu, Nov 11, 2010 at 12:01 AM, Gabriel Lichard <poopsmith@lavabit.com> wrote:

Because your console doesn't use UTF-8. Windows consoles are 8-bit, and
the 8-bit encoding they use is determined by your code page. UTF-8 is code
page 65001, but its support is sketchy. You can discover your code page by
typing "chcp".
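
As a rough sketch of what that means for a script: the string can be transcoded to the console's 8-bit encoding before it is printed. This assumes the console is on code page 437 (check yours with "chcp"), which Ruby knows as "IBM437". That code page and the variable names are assumptions for illustration, not something reported in this thread.

```ruby
sqrt = "\u221A"

# What Ruby currently assumes for data going in and out of the process.
p Encoding.default_external

# Assumption: the console uses code page 437, which has a slot for the square root sign.
console_bytes = sqrt.encode("IBM437")
p console_bytes.bytes.to_a

# For an encoding that lacks the character, substitute text instead of raising
# Encoding::UndefinedConversionError.
fallback = sqrt.encode("US-ASCII", :undef => :replace, :replace => "sqrt")
puts fallback
```

On a code page without the character, the :undef and :replace options give a readable substitute rather than an exception or a mangled glyph.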

···

Gabriel Lichard <poopsmith@lavabit.com> wrote:

Are you perhaps sending the output to a device that does not understand UTF-8?

I'm guessing that's the issue.
...
But then again when I just do:

File.open("test.txt", "w"){|x|x << "√"}

and run it, it makes the test.txt file and saves it without any problems
and with the actual square root character in the file.

Any idea what I'm doing wrong or why it won't appear in the console?

--
Tim Roberts, timr@probo.com
Providenza & Boekelheide, Inc.

If you have Vista or Win 7 installed, try your script in PowerShell.
Otherwise, install PowerShell, and try your script.

PowerShell is a .NET-based (almost) drop-in replacement for cmd.exe,
and, AFAIK, fully Unicode-aware.

···

On Sat, Nov 13, 2010 at 11:52 AM, Gabriel Lichard <poopsmith@lavabit.com> wrote:

I tried the chcp thing too :confused: :

system "chcp 65001"
s = "√"
p s
puts s

outputs:

Active code page: 65001
"√"
√

I'm still doing something wrong :confused: Any more ideas?

--
Phillip Gawlowski
