I grab some info from a webpage. Sometimes I get some strange
characters, as follows (with p):
To depart in a hurry; abscond: \342\200\234Your horse
has\nabsquatulated!\342\200\235 (Robert M. Bird) To die.
or (by print):
To depart in a hurry; abscond: “Your horse has absquatulated!â€
(Robert M. Bird) To die.
Any idea how to get rid of them?
Those are multi-byte characters (curly quotes, in this case). You
probably don't want to get rid of them, but you can use the iconv
library to transliterate them back to their ASCII almost-equivalents:
string = "To depart in a hurry; abscond: \342\200\234Your horse has\nabsquatulated!\342\200\235 (Robert M. Bird) To die."
=> "To depart in a hurry; abscond: \342\200\234Your horse
has\nabsquatulated!\342\200\235 (Robert M. Bird) To die."
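The transliteration call itself is not shown above. As a rough sketch only, and assuming a Ruby version where iconv is no longer in the standard library (it was removed in Ruby 2.0), String#encode with a :fallback can do a similar job. TRANSLIT and transliterate are hypothetical names made up for this illustration, and the table only covers a few punctuation characters.

```ruby
# Hypothetical mapping from a few common "smart" punctuation characters
# to ASCII look-alikes; extend it as needed.
TRANSLIT = {
  "\u201C" => '"',   # left double quotation mark
  "\u201D" => '"',   # right double quotation mark
  "\u2018" => "'",   # left single quotation mark
  "\u2019" => "'",   # right single quotation mark
  "\u2026" => "..."  # horizontal ellipsis
}.freeze

# Hypothetical helper: convert a UTF-8 string to US-ASCII, substituting
# look-alikes from TRANSLIT and "?" for any other non-ASCII character.
def transliterate(text)
  text.encode("US-ASCII", fallback: ->(char) { TRANSLIT.fetch(char, "?") })
end

string = "To depart in a hurry; abscond: \u201CYour horse has\nabsquatulated!\u201D (Robert M. Bird) To die."
puts transliterate(string)
```

This assumes the input is valid UTF-8; malformed bytes would make encode raise unless you also pass invalid: :replace.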
Here's a quick hack I used recently. The characters were messing up my
display in ncurses, and I did not need them.
dataitem.gsub!(/[^[:space:][:print:]]/,'')
I found this while googling; IIRC, it's used somewhere in Ruby on Rails.
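One caveat, as far as I can tell: on Ruby 1.9 and later, with a UTF-8 string, [:print:] also matches printable non-ASCII characters, so the pattern above may leave the curly quotes in place. Here is a minimal sketch of an explicitly ASCII-only variant. strip_non_ascii is a hypothetical helper name, not something taken from Rails.

```ruby
# Hypothetical helper: drop every character that is neither printable
# ASCII (0x20..0x7E) nor ASCII whitespace. scrub first removes invalid
# byte sequences so that gsub does not raise on malformed input.
def strip_non_ascii(text)
  text.scrub("").gsub(/[^\x20-\x7E\s]/, "")
end

dataitem = "abscond: \u201CYour horse has\nabsquatulated!\u201D"
puts strip_non_ascii(dataitem)
```

Unlike gsub!, this returns a new string and leaves the original untouched, which seemed the safer default for a sketch.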
There is no one-click installer for 1.9 on Windows as far as I can tell. Downloading and unpacking the zipped binaries didn't get me very far, as both ruby and irb complain that something is missing. Does the binary distribution require me to install anything else, such as libraries? If so, what additional software do I need to make 1.9 work, and where can I get it?
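class String
  # Byte values of printable ASCII characters that are replaced as well.
  REPLACED_ASCII = [34,35,37,42,43,44,45,47,60,61,62,63,91,92,93,94,96,123].freeze

  # Replaces, in place, every byte outside 32..124 and every byte listed
  # in REPLACED_ASCII with the given replacement string.
  # Works byte by byte (b[0].to_i only yields a byte value on Ruby 1.8),
  # so a multi-byte character produces one replacement per byte.
  def remove_nonascii(replacement)
    cleaned = bytes.map { |b|
      (b < 32 || b > 124 || REPLACED_ASCII.include?(b)) ? replacement : b.chr
    }.join
    replace(cleaned)
  end
end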
"Fatal injury or ruin:\223Hath some fond lover tic'd thee to thybane?\224\342\200\246".remove_nonascii('+')
=> "Fatal injury or ruin:+Hath some fond lover tic'd thee to thybane+++++"
As you can see, each offending byte was replaced with the character '+'.