Transform non-English text

Hello,
I have a Ruby application that deals with non-English text and I want to
transform some of that text so it contains only [a-zA-Z0-9].
Examples:
búsqueda -> busqueda
presenças -> presencas
für -> fur
avião1 -> aviao1

Can anyone help me?

Thanks.
Best regards,
Migrate

--
Posted via http://www.ruby-forum.com/.

I don't think there is a unified mapping table that transforms characters
outside [a-zA-Z0-9] into specific ones inside that set. But if you would
consider writing a map yourself, try something like:

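class String
  # Pairs of [pattern, ASCII replacement]; extend the list as needed.
  MAP = [[/ü/, 'u'],
         [/ö/, 'o']]

  # Returns a copy of the string with every mapped character replaced.
  def eng_char
    res = dup
    MAP.each { |pattern, replacement| res = res.gsub(pattern, replacement) }
    res
  end
end

s = "abücüöö"
puts s + " => " + s.eng_char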

----------
Will output:

abücüöö => abucuoo

Martin
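For one-to-one replacements like the ones above, String#tr can apply the whole table in a single pass instead of one gsub per character. A minimal sketch, assuming the source file and the strings are UTF-8; the names ACCENTED, PLAIN and ascii_fold are made up for illustration, and the character list is far from complete:

```ruby
# Hypothetical helper, not from the posts above: folds a small set of
# accented Latin letters to plain ASCII with a single String#tr call.
# The two strings must line up position by position (24 characters each).
ACCENTED = "áàâãäéèêëíìîïóòôõöúùûüçñ"
PLAIN    = "aaaaaeeeeiiiiooooouuuucn"

def ascii_fold(text)
  text.tr(ACCENTED, PLAIN)
end

%w[búsqueda presenças für avião1].each do |word|
  puts "#{word} => #{ascii_fold(word)}"
end
```

Characters that are not in the table pass through unchanged, so uppercase accented letters or other scripts would need their own entries.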

On Friday 19 January 2007 09:19, Hu Ma wrote:

Hello,
I have a Ruby application that deals with non-English text and I want to
transform some of that text so it contains only [a-zA-Z0-9].

You could try Iconv to convert from your encoding to ASCII. A quick
example:

require "iconv"

=> true

Iconv.iconv("ascii//translit", "iso-8859-1", "aéioù")

=> ["a'eio`u"]

Iconv.iconv("ascii//translit", "iso-8859-1", "aéiou")[0].tr('^a-z', '')

=> "aeiou"

Fred

On 19 January 2007 at 10:19, Hu Ma wrote:
Hello,

Thanks for your help.

I will try both approaches to see what fits best.

Best regards,
Migrate

F. Senault wrote:

require "iconv"

=> true

Iconv.iconv("ascii//translit", "iso-8859-1", "aéioù")

=> ["a'eio`u"]

Iconv.iconv("ascii//translit", "iso-8859-1", "aéiou")[0].tr('^a-z', '')

=> "aeiou"

iconv translit is really nice... when it works. It works on our FreeBSD
server but not on my Ubuntu dev machine. Your mileage may vary.

Daniel
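For anyone hitting the same portability problem: Iconv was removed from Ruby's standard library in 2.0, and since Ruby 2.2 String#unicode_normalize offers a route that does not depend on the system's iconv. The idea is to decompose each accented letter into a base letter plus combining marks (NFD), drop the marks, then delete whatever is still outside a-z, A-Z and 0-9. A rough sketch, assuming UTF-8 input; strip_accents is a made-up name, not a library method:

```ruby
# Hypothetical helper: accent stripping via Unicode decomposition.
# Assumes the input string is UTF-8 (unicode_normalize raises otherwise).
def strip_accents(text)
  decomposed = text.unicode_normalize(:nfd)  # "ú" becomes "u" + U+0301
  unmarked   = decomposed.gsub(/\p{Mn}/, "") # remove the combining marks
  unmarked.delete("^a-zA-Z0-9")              # drop anything still non-alphanumeric
end

%w[búsqueda presenças für avião1].each do |word|
  puts "#{word} => #{strip_accents(word)}"
end
```

This handles the four examples from the original question. Letters with no decomposition, such as ß or ø, are simply deleted by the last step, as are spaces and punctuation, so a small hand-written map may still be needed for those.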