Libxml utf-8 locale

Hello,

Honestly, if I had to pick two things I hate about computers, they would be XML and UTF-8.

I have an xml file:

"<?xml version='1.0' encoding='UTF-8'?>...."

I have installed libxml-ruby and liblocale-ruby on my Debian etch system.

I have tried:
export LANG=hu_HU.UTF-8
kate sample.xml
Kate opens the file correctly.

I have also tried:
export LANG=hu_HU.UTF-8
my_script.rb sample.xml

It cannot handle the UTF-8 characters. I have also tried inserting this line into my script (with require 'locale', of course):
Locale.setlocale(Locale::LC_ALL, 'hu_HU.UTF-8')

No effect.

My script is similar to the one in the docs:

        require 'xml/libxml'
        doc = XML::Document.file('output.xml')
        root = doc.root

        puts "Root element name: #{root.name}"

        elem3 = root.find('elem3').to_a.first
        puts "Elem3: #{elem3['attr']}"

        doc.find('//root_node/foo/bar').each do |node|
          puts "Node path: #{node.path} \t Contents: #{node}"
        end

(I am not using exactly this, but something similar with the setlocale call added.)

The output is filled with:
Kínál

What to do now?

Mage
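
A note on the symptom: the garbled output above is what one would expect when UTF-8 bytes are displayed as if they were ISO-8859-1, which suggests the mismatch may be on the output side rather than in the parsing. The sketch below reproduces the effect. It assumes a current Ruby (1.9 or later, not the 1.8 setup in this thread) and assumes the intended word is "Kínál", which is inferred from the garbled bytes rather than stated in the post.

```ruby
# Assumption: the intended word is "Kínál" (inferred from the garbled output).
# \u escapes keep this file ASCII-only, so the source encoding does not matter.
word = "K\u00EDn\u00E1l"

# In UTF-8, each accented letter here takes two bytes.
utf8_bytes = word.encode("UTF-8").bytes

# Misread those same bytes as ISO-8859-1, one character per byte,
# then transcode to UTF-8 so the result can be printed.
garbled = utf8_bytes.pack("C*").force_encoding("ISO-8859-1").encode("UTF-8")

puts "characters: #{word.length}, UTF-8 bytes: #{utf8_bytes.length}"
puts "misread as ISO-8859-1: #{garbled}"
```

If the terminal or the program writing to it treats the stream as a single-byte encoding, every two-byte character shows up as two unrelated characters, which matches the pattern in the reported output.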

Have you tried putting
$KCODE = 'u'
at the top of your script (possibly before any requires)?

On Mar 7, 2006, at 2:43 PM, Mage wrote:

The output is filled with:
Kínál

What to do now?

Mage

Logan Capaldo wrote:

Have you tried putting
$KCODE = 'u'
at the top of your script (possibly before any requires)?

It didn't help.

For now I am running an iconv conversion on some nodes, but I think that is a nasty workaround.

Mage
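
The iconv workaround mentioned above amounts to transcoding each node's text from UTF-8 into whatever encoding the terminal actually expects. A minimal sketch follows, under several assumptions: the target encoding is ISO-8859-2 (a guess for a Hungarian setup, not stated in the thread), the helper name to_terminal is hypothetical, and it uses String#encode from Ruby 1.9 or later in place of the Iconv library, which was the Ruby 1.8 tool and is no longer part of the standard library.

```ruby
# Hypothetical helper: transcode UTF-8 text (such as node content returned
# by the XML parser) into the encoding the terminal expects.
# Assumption: ISO-8859-2 is the terminal encoding; the thread does not say.
def to_terminal(text, target = "ISO-8859-2")
  # Characters with no equivalent in the target become "?" instead of raising.
  text.encode(target, "UTF-8", invalid: :replace, undef: :replace, replace: "?")
end

sample = "K\u00EDn\u00E1l"      # "Kínál" as UTF-8, written with escapes
converted = to_terminal(sample)

puts converted.encoding          # name of the target encoding
puts converted.bytes.inspect     # one byte per character after conversion
```

Wrapping the conversion in one helper and calling it only where text is printed would at least keep the strings in UTF-8 everywhere else in the script, instead of converting node by node.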
