How to handle Iconv errors (newbie question)

Hi,

I have an array of arrays of strings, e.g.,
  [["string"],["string"],["string"]]

and I want to
    (1) convert each string to a different encoding, and
    (2) print the newly encoded strings.

So far I have been trying

,----

myArray.flatten.each do |x|
   puts Iconv.new(to, from).iconv(x)
end

`----

where 'to' and 'from' are encodings. This works most of the time, but
the strings are taken from html files that my script is fetching from
the web. Sometimes they contain characters which cause Iconv to fail
with

,----

Iconv::IllegalSequence: "..."

`----

It's not crucial for my script to print every string in the array, so
I'd like to just ignore any that Iconv can't handle. How can I handle
the Iconv error so that the processing of the array can continue?

Regards,

Matt

Perhaps the best solution is just to tell iconv to skip over illegal
sequences in the source string. To do this, append "//IGNORE" to the
+to+ string.

E.g.

ill = "foo\776\776bar" # Illegal utf-8 string
Iconv.iconv('ISO-8859-15', 'UTF-8', ill) # Raises Iconv::IllegalSequence
Iconv.iconv('ISO-8859-15//IGNORE', 'UTF-8', ill) # => "foobar"

Paul.
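
For readers on newer Rubies: Iconv was removed from the standard library as of Ruby 2.0, so the following is only a rough sketch of the same two ideas (skip a string that fails, or drop the illegal bytes) using String#encode instead. This swaps in a different API from the one discussed above; the helper names convert_all and convert_lossy are made up for illustration, and the encodings are just the ones from the example.

```ruby
# Hypothetical helper: convert every string in a nested array, silently
# skipping any string that cannot be converted (the "just ignore it" case).
def convert_all(strings, to, from)
  strings.flatten.map do |s|
    begin
      # dup so the caller's string is not re-tagged in place
      s.dup.force_encoding(from).encode(to)
    rescue EncodingError
      # Covers both Encoding::InvalidByteSequenceError and
      # Encoding::UndefinedConversionError
      nil
    end
  end.compact
end

# Hypothetical helper: keep the string but drop whatever cannot be
# converted, roughly what "//IGNORE" does with Iconv.
def convert_lossy(string, to, from)
  string.dup.force_encoding(from).encode(
    to, invalid: :replace, undef: :replace, replace: ""
  )
end

ill = "foo\xFE\xFEbar" # 0xFE is never a legal byte in UTF-8

# The illegal bytes are dropped, leaving the bytes of "foobar".
cleaned = convert_lossy(ill, "ISO-8859-15", "UTF-8")

# The illegal string is skipped entirely; the others are converted.
converted = convert_all([["ok"], [ill], ["fine"]], "ISO-8859-15", "UTF-8")
converted.each { |s| puts s }
```

Which of the two to use depends on whether a partly mangled string is better than no string at all; for scraped HTML the lossy version is usually the less surprising one.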

On Apr 5, 2005 10:25 AM, Matthew Huggett <mhuggett@zam.att.ne.jp> wrote:

It's not crucial for my script to print every string in the array, so
I'd like to just ignore any that Iconv can't handle. How can I handle
the Iconv error so that the processing of the array can continue?

Perhaps the best solution is just to tell iconv to skip over illegal
sequences in the source string. To do this, append "//IGNORE" to the
+to+ string.

E.g.

ill = "foo\776\776bar" # Illegal utf-8 string
Iconv.iconv('ISO-8859-15', 'UTF-8', ill) # Raises Iconv::IllegalSequence
Iconv.iconv('ISO-8859-15//IGNORE', 'UTF-8', ill) # => "foobar"

That did it. Thanks a lot!

Matt

From: Paul Battley <pbattley@gmail.com>