Inconsistent IO character reading when converting encoding

In Ruby 1.9.3-p429, I am trying to parse plain-text files in various
encodings, converting them to UTF-8 strings as they are read. Non-ASCII
characters come through fine when the file is encoded as UTF-8, but
problems come up with non-UTF-8 files.

Simplified example:

File.open(file) do |io|
  # Transcode from the file's charset to UTF-8 while reading
  io.set_encoding("#{charset.upcase}:#{Encoding::UTF_8}")
  line, char = "", nil

  # Read one character at a time until EOF or an end-of-line character
  until io.eof? || char == ?\n || char == ?\r
    char = io.readchar
    puts "Character #{char} has #{char.each_codepoint.count} codepoints"
    puts "SLICE FAIL" unless char == char.slice(0, 1)

    line << char
  end
  line
end

Both test files contain just the single string áÁð, encoded
appropriately. I have verified that the files are encoded correctly via
"$ file -i <file_name>".

With a UTF-8 file, I get back:
Character á has 1 codepoints
Character Á has 1 codepoints
Character ð has 1 codepoints

With an ISO-8859-1 file:
Character á has 2 codepoints
SLICE FAIL
Character Á has 2 codepoints
SLICE FAIL
Character ð has 2 codepoints
SLICE FAIL

The way I am interpreting this is that readchar is returning an
incorrectly converted string, which in turn causes slice to return the
wrong result.
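
To isolate the conversion itself, running the same bytes through
String#encode outside of IO produces the expected single-codepoint
characters (a sketch, using the test file generated above):

# encoding: utf-8
raw  = File.read("test_latin1.txt", mode: "rb")   # raw bytes, no transcoding
utf8 = raw.force_encoding("ISO-8859-1").encode("UTF-8")
utf8.each_char { |c| puts "Character #{c} has #{c.each_codepoint.count} codepoints" }
# => each character reports 1 codepoint

This suggests the conversion rules themselves are fine and that the
problem is specific to reading through the IO layer.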

Is this behavior correct, or am I specifying the file's external
encoding incorrectly? I would rather not rewrite this process, so I am
hoping I am making a mistake somewhere. There are reasons why I am
parsing files character by character, but I don't think they are
relevant to my question. Specifying the internal and external encodings
as options to File.open yielded the same results.
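
By that I mean variants along these lines (a sketch; charset is assumed
to hold the detected charset name, as above):

# Encodings embedded in the mode string:
File.open(file, "r:#{charset.upcase}:#{Encoding::UTF_8}") do |io|
  # ... same readchar loop as above ...
end

# Equivalent explicit options:
File.open(file, mode: "r",
                external_encoding: charset.upcase,
                internal_encoding: Encoding::UTF_8) do |io|
  # ... same readchar loop as above ...
end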
