I have the following tab-delimited UCS-2LE input file, ucs2le_sample.txt:
Job Title Second case
"EVP; COO, DIGITAL TV" "ESPTA TÉCNICO ""B"""
It has two columns (Job Title, Second case).
Read line by line in Ruby, it looks like this internally:
irb(main):001:0> File.open("ucs2le_sample.txt","r").each do |line|
irb(main):002:1* p line
irb(main):003:1> end
"\377\376J\000o\000b\000 \000T\000i\000t\000l\000e\000\t\000S\000e\000c
\000o\000n\000d\000 \000c\000a\000s\000e\000\r\000\n"
"\000\"\000E\000V\000P\000;\000 \000C\000O\000O\000,\000 \000D\000I
\000G\000I\000T\000A\000L\000 \000T\000V\000\"\000\t\000\"\000E\000S
\000P\000T\000A\000 \000T\000\311\000C\000N\000I\000C\000O\000
\000\"\000\"\000B\000\"\000\"\000\"\000\r\000\n"
"\000"
=> #<File:ucs2le_sample.txt>
I am trying to convert this file to UTF-8 with the following
conv.rb script:
require 'iconv'

File.open("utf8.txt","wb") do |f|
  File.open("ucs2le_sample.txt","rb").each do |line|
    ic = Iconv.new("UTF-8","UCS-2LE")
    f.write(ic.iconv(line))
  end
end
When I run it, I get this error message:
chris@chris-ub:~/staging/ruby/csv2db$ ruby conv.rb
conv.rb:6:in `iconv': "\n" (Iconv::InvalidCharacter)
from conv.rb:6
from conv.rb:4:in `each'
from conv.rb:4
from conv.rb:3:in `open'
from conv.rb:3
How can I modify this little script to convert the UCS-2LE file to
UTF-8?
Thanks,
chris
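
A possible fix, sketched under some assumptions: the traceback suggests the problem is the line-by-line read rather than Iconv itself. In UCS-2LE a newline is the two bytes "\n\000", so splitting on the single byte "\n" leaves the chunks misaligned (note the stray "\000" at the end of the irb output above). The sketch below assumes Ruby 1.9 or later, where String#force_encoding and String#encode are available, and converts the whole content in one pass. The helper name ucs2le_to_utf8 and the inline sample bytes are made up for illustration.

```ruby
# Sample text standing in for ucs2le_sample.txt (hypothetical inline copy).
sample = "Job Title\tSecond case\r\n" \
         "\"EVP; COO, DIGITAL TV\"\t\"ESPTA T\u00C9CNICO \"\"B\"\"\"\r\n"

# Build the raw bytes as they would sit on disk: BOM (FF FE) + UTF-16LE text.
# UCS-2LE is treated here as UTF-16LE, which agrees with it for these characters.
bom = [0xFF, 0xFE].pack("C*")
raw = bom + sample.encode("UTF-16LE").force_encoding("BINARY")

# What the original loop effectively does: split after every "\n" byte.
# The first piece ends with half a character, and a lone NUL byte is left over.
pieces = raw.each_line.to_a
p pieces.map { |piece| piece.bytesize }

# Hypothetical helper: reinterpret the complete byte string as UTF-16LE,
# transcode it to UTF-8 in one go, and drop the leading BOM if present.
def ucs2le_to_utf8(bytes)
  text = bytes.dup.force_encoding("UTF-16LE").encode("UTF-8")
  text.sub(/\A\uFEFF/, "")
end

utf8 = ucs2le_to_utf8(raw)
print utf8
```

With the real file, something like File.open("utf8.txt", "wb") { |f| f.write(ucs2le_to_utf8(File.binread("ucs2le_sample.txt"))) } should do the same job. On Ruby 1.8, where these String methods do not exist, the equivalent idea would be to read the whole file in binary mode and hand it to Iconv in a single call, for example Iconv.conv("UTF-8", "UCS-2LE", data), instead of converting line by line.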