Ruby - puts with accents

Hello guys!

I'm just starting with Ruby and I have a problem with a console app. I've
been searching since yesterday, but nothing has worked, so I'm writing to
ask for help.

Below I've pasted my code and the results. I'm running Windows 7 and
Ruby 1.9.2.

#! /usr/bin/ruby
# encoding: UTF-8

require 'iconv'
require 'rubygems'
require "mysql"
require 'nokogiri'
require './lib/mihttp.rb'

# encoding: utf-8
`chcp 852` #change cmd encoding to unicode
puts 'será test '

utf8 = "áóíúé"
puts utf8

latin1 = utf8.encode("iso-8859-1")
puts latin1

exit

I also tried the iconv and magic_encoding gems, and adding lines that set
Encoding.default_internal and Encoding.default_external, but I get the
same results. I don't know what's wrong!

C:\Work\RubysApps>ruby hello.rb
será test
├í├│├ş├║├ę
ߡÝ˙Ú

C:\Work\RubysApps>ruby hello.rb
hello.rb:1: invalid multibyte char (US-ASCII)

C:\Work\RubysApps>ruby -v
ruby 1.9.2p290 (2011-07-09) [i386-mingw32]

Thanks
M
PS: Sorry for my horrible English!

--
Posted via http://www.ruby-forum.com/.

The backticks run `chcp 852` in a subprocess, so I'm guessing it only affects that process, not the parent process. What happens if you run chcp yourself before running the Ruby script?

On May 18, 2012, at 11:31, "Mariano José G." <lists@ruby-forum.com> wrote:

# encoding: utf-8
`chcp 852` #change cmd encoding to unicode

Codepage 852 isn't the Unicode codepage - it's MSDOS Latin-2 which
isn't even ISO 8859-2 - see
http://en.wikipedia.org/wiki/Code_page_852.

According to Microsoft's "Code Page Identifiers" documentation,
the codepage for UTF-8 is 65001.

You should be able to set the codepage with

  chcp 65001

then just output your UTF-8 strings without having to convert them as
long as you have the

  # encoding: utf-8

line near the top of your script.

I'm afraid I can't test this at the moment as I don't have access to a
Windows machine.
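
For what it's worth, the effect of the magic comment can be checked without a console at all. This is only a minimal sketch: `describe` is a made-up helper name, not something from the thread or from Ruby itself. It reports the encoding Ruby assigns to a string literal, plus its character and byte counts.

```ruby
# encoding: utf-8

# Hypothetical helper (not part of Ruby or of the original script):
# summarize how Ruby sees a string.
def describe(str)
  { encoding: str.encoding.name, chars: str.length, bytes: str.bytesize }
end

# With the magic comment in place the literal is tagged UTF-8,
# and each accented letter here occupies two bytes.
info = describe("áóíúé")
puts "#{info[:encoding]}: #{info[:chars]} chars, #{info[:bytes]} bytes"
```

If the encoding printed is not UTF-8, the file was probably saved in a different encoding than the one the comment declares.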

Regards,
Sean

On Fri, May 18, 2012 at 7:31 PM, Mariano José G. <lists@ruby-forum.com> wrote:

# encoding: utf-8
`chcp 852` #change cmd encoding to unicode

You should encode the output in the same encoding the console uses
(its "codepage", which is what `chcp` displays). So for
example:

# encoding: UTF-8
utf8 = "áóíúé"
puts utf8.encode('cp852')

Obviously console codepages only contain a subset of Unicode. I have
tried to get `chcp 65001` to work with Ruby once, but failed
miserably; unfortunately Windows' console is utterly broken beyond any
repair when it comes to Unicode.

-- Matma Rex

Well, I tried chcp 852, 850 and 65001, and nothing worked. I also tried
changing the file encoding to ISO-8859-1 and to UTF-8.

I don't know why. But on my home PC it works; that one also runs Windows 7,
but the Ruby version is 1.9.3 and chcp is set to 850.

Sean O'halpin wrote in post #1061359:

# encoding: utf-8
`chcp 852` #change cmd encoding to unicode

Codepage 852 isn't the Unicode codepage - it's MSDOS Latin-2 which
isn't even ISO 8859-2 - see
http://en.wikipedia.org/wiki/Code_page_852.

According to Microsoft's "Code Page Identifiers" documentation,
the codepage for UTF-8 is 65001.

You should be able to set the codepage with

  chcp 65001

then just output your UTF-8 strings without having to convert them as
long as you have the

  # encoding: utf-8

line near the top of your script.

I'm afraid I can't test this at the moment as I don't have access to a
Windows machine.

First, uninstall 1.9.2 and install a recent 1.9.3 (p125 or later) from
http://rubyinstaller.org/. If you're using the 1.9 family on Windows,
purge every other version except 1.9.3p125 or higher.

Here's what I get on Win7 32bit in a cmd.exe shell...

C:\Users\Jon\Documents\RubyDev\sandbox>chcp
Active code page: 437

*** encoding_1.rb file contents ***
# encoding: UTF-8
utf8 = "Some accented text áóíúé with regular text."
puts utf8

C:\Users\Jon\Documents\RubyDev\sandbox>pik ruby encoding_1.rb
jruby 1.6.7.2 (ruby-1.9.2-p312) (2012-05-01 26e08ba) (Java HotSpot(TM)
Client VM 1.7.0_04) [Windows 7-x86-java]

Some accented text áóíúé with regular text.

ruby 1.8.7 (2012-02-08 patchlevel 358) [i386-mingw32]

Some accented text áóíúé with regular text.

ruby 1.9.3p125 (2012-02-16) [i386-mingw32]

Some accented text áóíúé with regular text.

ruby 1.9.3p223 (2012-05-19 revision 35717) [i386-mingw32]

Some accented text áóíúé with regular text.

tcs-ruby 1.9.3p196 (2012-04-21, TCS patched 2012-04-21) [i386-mingw32]

Some accented text áóíúé with regular text.

ruby 2.0.0dev (2012-05-21 trunk 35732) [i386-mingw32]

Some accented text áóíúé with regular text.

...and without the `# encoding: UTF-8` at the top of the file:

C:\Users\Jon\Documents\RubyDev\sandbox>pik ruby encoding_1.rb
jruby 1.6.7.2 (ruby-1.9.2-p312) (2012-05-01 26e08ba) (Java HotSpot(TM)
Client VM 1.7.0_04) [Windows 7-x86-java]

SyntaxError: encoding_1.rb:1: invalid multibyte char (US-ASCII)

ruby 1.8.7 (2012-02-08 patchlevel 358) [i386-mingw32]

Some accented text áóíúé with regular text.

ruby 1.9.3p125 (2012-02-16) [i386-mingw32]

encoding_1.rb:1: invalid multibyte char (US-ASCII)
encoding_1.rb:1: invalid multibyte char (US-ASCII)

ruby 1.9.3p223 (2012-05-19 revision 35717) [i386-mingw32]

encoding_1.rb:1: invalid multibyte char (US-ASCII)
encoding_1.rb:1: invalid multibyte char (US-ASCII)

tcs-ruby 1.9.3p196 (2012-04-21, TCS patched 2012-04-21) [i386-mingw32]

encoding_1.rb:1: invalid multibyte char (US-ASCII)
encoding_1.rb:1: invalid multibyte char (US-ASCII)

ruby 2.0.0dev (2012-05-21 trunk 35732) [i386-mingw32]

encoding_1.rb:1: invalid multibyte char (US-ASCII)
encoding_1.rb:1: invalid multibyte char (US-ASCII)

If you want to toy with poor old cmd.exe, try using the `type` command
(like `cat`) to print `encoding_1.rb` after switching between different
codepages.
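
The same experiment can be approximated from Ruby without touching the console. This is a rough sketch, not a description of what cmd.exe does internally: `seen_as` is a hypothetical helper that reinterprets a string's raw bytes as if they were in a given codepage.

```ruby
# encoding: utf-8

# Hypothetical helper: reinterpret the raw bytes of `text` as `codepage`
# and convert the result to UTF-8, mimicking a console that assumes
# that codepage when it displays the bytes.
def seen_as(text, codepage)
  text.dup.force_encoding(codepage).encode("UTF-8", invalid: :replace, undef: :replace)
end

utf8 = "áóíúé"

# UTF-8 bytes shown by a CP852 console.
puts seen_as(utf8, "CP852")                       # ├í├│├ş├║├ę

# ISO-8859-1 bytes shown by a CP852 console.
puts seen_as(utf8.encode("ISO-8859-1"), "CP852")  # ߡÝ˙Ú
```

Both lines match the garbled output in the original post, which suggests the console there was displaying the bytes as codepage 852 and the strings were never converted to it.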

Jon
