I am trying to understand how encoding interacts with system calls,
particularly on Windows.
I suspect there may be a bug in how Ruby invokes system calls with
non-ASCII characters in the command.
I have a demonstration program below (also attached, in case the
encoding gets corrupted in the mailing list). It fails on Windows 7 and
Windows XP, even when I run "chcp 65001" first. I have my Command Prompt
configured to use the Lucida Console font (as the default Raster Fonts
have even more problems with non-ASCII characters).
# encoding: utf-8
def test(word)
  returned = `echo #{word}`.chomp
  puts "#{word} == #{returned}"
  raise "Cannot roundtrip #{word}" unless word == returned
end
test "good"
test "bÃd"
puts "Success"
# win7, cmd.exe font set to Lucida Console, chcp 65001
# good == good
# bÃd == bÃd
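When two strings look identical on screen but fail ==, it can help to compare their encodings and raw bytes rather than the printed text. Here is a minimal diagnostic sketch for that; describe is a hypothetical helper name, not part of the program above. As far as I know, backtick output is tagged with Encoding.default_external, so on Windows the returned string may carry a different encoding than the UTF-8 literal.

```ruby
# Hypothetical diagnostic helper (not part of the original program).
# Reports the encoding label, whether the bytes are valid in that
# encoding, and the raw bytes in hex.
def describe(str)
  hex = str.bytes.map { |b| format("%02x", b) }.join(" ")
  "#{str.encoding.name} valid=#{str.valid_encoding?} bytes=#{hex}"
end

word = "b\u00C3d" # same text as "bÃd", written with an escape
puts describe(word)
```

Calling describe on both word and returned inside test should show whether the bytes changed on the way through cmd.exe, or whether only the encoding label differs.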
I suspect that something in here
needs to make sure the string being passed to the cmd.exe process is
encoded properly, but I'm in over my head.
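To make "encoded properly" concrete, here is a rough sketch of one possible workaround on the Ruby side: transcode the text to the console's code page before building the command, and transcode the output back to UTF-8. All three helper names are hypothetical, and the assumption that the code page number reported by chcp maps to a Ruby encoding named "CP<number>" only holds for code pages Ruby knows about (CP437, CP850, CP1252 and CP65001 are among them, as far as I know).

```ruby
# Hypothetical helpers; the names are illustrative only.

# Turn chcp's report (e.g. "Active code page: 850") into a Ruby
# encoding name such as "CP850". Assumes the first number in the
# text is the code page.
def encoding_from_chcp(chcp_output)
  number = chcp_output[/\d+/]
  raise ArgumentError, "no code page in #{chcp_output.inspect}" unless number
  "CP#{number}"
end

# Transcode UTF-8 text to the console code page. Raises
# Encoding::UndefinedConversionError if a character has no
# equivalent in that code page.
def to_console(str, console_encoding)
  str.encode(console_encoding)
end

# Relabel raw console output with its real encoding, then
# transcode it to UTF-8 so it can be compared with the input.
def from_console(raw, console_encoding)
  raw.dup.force_encoding(console_encoding).encode(Encoding::UTF_8)
end

enc  = encoding_from_chcp("Active code page: 1252")
word = "b\u00C3d"
sent = to_console(word, enc)
back = from_console(sent.b, enc) # .b simulates untagged bytes from a pipe
puts back == word
```

In the real program the argument to encoding_from_chcp would come from running chcp, and to_console / from_console would wrap the backtick call. Whether that is enough depends on how Ruby itself hands the command string to cmd.exe, which is the part I am unsure about.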
I also suspect that this response holds a clue about the change that may
need to be made:
I'd like to know:
1) Is this a bug in Ruby on Windows?
or
2) How can I change the program above so that it completes successfully
on Windows?
Have you tried something like `chcp 65001 && echo #{word}`?
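A sketch of how that suggestion might be tried from the test program, switching the code page inside the same cmd.exe invocation that runs echo. The commands are chained with &&, which is cmd.exe's separator, and chcp's own "Active code page" message is sent to NUL so it does not end up in the captured output. echo_command and windows? are hypothetical helper names, and I have not confirmed that this makes the roundtrip succeed on any particular Ruby version.

```ruby
require "rbconfig"

# Hypothetical helper: build a cmd.exe command line that switches
# the code page to UTF-8 (65001) and then echoes the word.
def echo_command(word)
  "chcp 65001 >NUL && echo #{word}"
end

# Hypothetical helper: true when running on a Windows build of Ruby.
def windows?
  RbConfig::CONFIG["host_os"] =~ /mswin|mingw|cygwin/ ? true : false
end

if windows?
  word = "b\u00C3d"
  # Assumption: after chcp 65001 the output bytes are UTF-8, so
  # relabel them before comparing with the UTF-8 input.
  returned = `#{echo_command(word)}`.chomp.force_encoding(Encoding::UTF_8)
  puts "#{word} == #{returned}"
else
  puts "Skipping: this check only makes sense under cmd.exe"
end
```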
(In reply to Joshua F.'s post of Tuesday, 22 October 2013, 5:31.)