Lionel Bouton wrote the following on 11.09.2007 12:48 :
Lee Jarvis wrote the following on 11.09.2007 12:41 :
OK, I'll try to explain what I mean as well as I can.
Let's say I have a hash like this:
hash = { 'a' => '1' } # just an example; it's actually far bigger
and if a user inputs abcdabcd, I want it to sub all of the a's with 1's.
As I said, the hash is far larger, which is why I can't just do it with
gsub.
Any ideas?
Thanks in advance.
Lee
yourstring.split(//).map{|c| hash[c] || c}.join
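For reference, here is a minimal sketch of that one-liner applied to the example input from the question. The method name translate_chars and the two-entry table are made up for illustration; the real hash would be much larger.

```ruby
# Hypothetical helper wrapping the one-liner; not part of the original post.
def translate_chars(str, table)
  # split(//) yields one element per character. Characters without an
  # entry in the table fall through unchanged because of `|| c`.
  str.split(//).map { |c| table[c] || c }.join
end

table = { 'a' => '1', 'c' => '3' }        # stand-in for the larger real hash
puts translate_chars('abcdabcd', table)   # prints "1b3d1b3d"
```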
Note that if your hash only maps single characters to single
characters, you can use String#tr (or tr!). If you are after
performance, benchmark it for your use case: even if String#tr is faster
in itself, you first have to build the two strings it takes from your
hash, so it may not be worth it.
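A rough sketch of what that could look like, assuming every key and value is exactly one character and none of them is "^", "-" or "\" (String#tr gives those a special meaning). The helper name tr_args and the sample table are invented for the example, and the timings will vary by machine and Ruby version.

```ruby
require 'benchmark'

# Hypothetical helper: builds the two String#tr arguments from a hash that
# maps single characters to single characters. Assumes no key or value is
# "^", "-" or "\", which String#tr treats specially.
def tr_args(table)
  [table.keys.join, table.values.join]
end

table = { 'a' => '1', 'b' => '2' }
from, to = tr_args(table)
input = 'abcdabcd' * 1_000

# Both approaches should produce the same string.
via_tr  = input.tr(from, to)
via_map = input.split(//).map { |c| table[c] || c }.join
puts via_tr == via_map   # prints "true"

# Time both, rebuilding the tr arguments on every run so that the
# preparation cost is included in the comparison.
Benchmark.bm(10) do |bm|
  bm.report('tr')        { 100.times { f, t = tr_args(table); input.tr(f, t) } }
  bm.report('split/map') { 100.times { input.split(//).map { |c| table[c] || c }.join } }
end
```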
If you are processing UTF-8 content, String#tr is probably not safe
(there are libraries out there for fixing this though IIRC), but my
first answer probably is (assuming $KCODE='u'; require 'jcode'...) as
the regexp processing is utf-8 aware, so the String#split should be safe.
Thanks, that worked well. And no, it's not single chars, which is the only
reason I'm doing it this way.
I have to split on whitespace (/ /) because splitting on characters would
obviously split the text I want to transform, which means it won't match
if the characters are trailing another word, HTML special chars for
example.
I'd rather not do the split step; IMHO direct replacement will be faster:
h = {"#126" => "~"}
# $1 is the text between "&" and ";"; unknown entities are put back unchanged
s.gsub(/&([^;]+);/) { h[$1] || "&#{$1};" }
Btw, I believe there are standard classes that do this type of
replacement (entities in HTML documents) - maybe it's in CGI.
Kind regards
robert
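The standard-library method hinted at above is most likely CGI.unescapeHTML, which decodes numeric entities plus a handful of named ones (such as &amp;amp;, &amp;lt;, &amp;gt; and &amp;quot;). A small sketch, with a made-up wrapper name and sample string; it is not a replacement for a full named-entity table:

```ruby
require 'cgi'

# Hypothetical wrapper, only here to give the call a descriptive name.
def decode_basic_entities(str)
  CGI.unescapeHTML(str)
end

puts decode_basic_entities('1 &lt; 2 &amp;&amp; x &#126; y')   # prints "1 < 2 && x ~ y"
```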
If you're just trying to translate numeric html entities it's easy:
str.gsub(/&#(\d+);/){ [$1.to_i].pack('U') }
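To make the behaviour concrete, here is a small sketch of that one-liner on a sample string. The method name and input are invented for illustration; only decimal entities are handled, so named ones such as &amp;amp; are left alone.

```ruby
# Hypothetical wrapper around the gsub/pack one-liner.
def decode_numeric_entities(str)
  # $1 holds the decimal code point; pack('U') turns it into a UTF-8 character.
  str.gsub(/&#(\d+);/) { [$1.to_i].pack('U') }
end

puts decode_numeric_entities('&#126; costs 5 &#8364;, &amp; stays')
# prints "~ costs 5 €, &amp; stays"
```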
If you also want named entities, I suggest the htmlentities gem.
If it's for a more general case, how about:
rx = Regexp.new(hash.keys.map{|k|Regexp.escape(k)}.join("|"))
str.gsub(rx){ hash[$&] }
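A sketch of the general case with one extra precaution that is my own addition, not part of the post: keys are sorted longest first, so that a key which is a prefix of another (say "ab" and "abc") does not shadow the longer one in the alternation. The helper name and the sample tables are hypothetical.

```ruby
# Hypothetical helper; `table` maps arbitrary substrings to replacements.
def replace_all(str, table)
  # Longest keys first, so "abc" wins over its prefix "ab".
  keys = table.keys.sort_by { |k| -k.length }
  # Regexp.union escapes each string, like Regexp.escape plus join("|").
  rx = Regexp.union(keys)
  str.gsub(rx) { table[$&] }
end

entities = { '&lt;' => '<', '&gt;' => '>', '&amp;' => '&' }
puts replace_all('1 &lt; 2 &amp;&amp; 3 &gt; 2', entities)   # prints "1 < 2 && 3 > 2"
puts replace_all('abc ab', { 'ab' => 'X', 'abc' => 'Y' })    # prints "Y X"
```

On Ruby 1.9 and later, str.gsub(rx, table) does the same lookup without a block, since gsub accepts a hash as its second argument.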