The original text in the file contains characters that I do not want to
include in my final result, which should contain only ASCII 65..90 and
97..122. So I do not understand: what arguments should be given to gsub?
PS--I left the numerals and all kinds of punctuation marks in there,
just in case you have them in the original file--though they are
certainly not within your original range of ASCII 65..90 and 97..122.
Your first argument to gsub appears to be ASCII 197.
Yes, you're correct, but I still do not know how to fix my code... As the
source text contains chars outside 65..90 and 97..122, how can I
remove or replace them?
Strings in Ruby 1.9 are complicated beasts. I had a go at understanding
them:
So it really depends on what you're trying to do. If you want to
manipulate this file as a series of bytes, and match particular bytes,
then open it in binary mode ('rb'), and pass only binary strings to
gsub.
  temp.gsub!("xxx".force_encoding("BINARY"), "")
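
To make the byte-oriented approach concrete, here is a minimal sketch of
stripping everything outside 65..90 and 97..122 from a binary string. The
method name letters_only and the sample data are made up for illustration;
in a real script the bytes would come from the file opened with 'rb'.

```ruby
# Hypothetical helper: keep only the bytes 65..90 (A-Z) and 97..122 (a-z).
def letters_only(bytes)
  # Work on a BINARY-tagged copy so every byte counts as one character
  # and no byte sequence can be "invalid".
  binary = bytes.dup.force_encoding("BINARY")
  binary.gsub(/[^A-Za-z]/, "")
end

# Made-up sample: byte 197 (0xC5) and byte 246 (0xF6) are outside ASCII.
sample = "Hello, \xC5ngstr\xF6m 42!"

puts letters_only(sample)
```

The regexp contains only ASCII characters, so it can be matched against a
BINARY string without any extra flags.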
The trouble with opening the file as UTF-8, and doing regexp matches
with UTF-8 characters, is that your program will raise an exception
(typically ArgumentError, "invalid byte sequence in UTF-8") when fed
invalid UTF-8 data. So it is not good for "data cleaning" exercises.
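
One way to guard against this, sketched here as an assumption rather than
as the one right answer, is to ask the string whether it is valid before
matching and to fall back to byte-wise processing when it is not. The
helper name strip_non_letters and the sample string are hypothetical.

```ruby
# Hypothetical helper: keep only A-Z and a-z, whether or not the input is
# valid in its declared encoding.
def strip_non_letters(text)
  # An invalid string is re-tagged as BINARY (on a copy), so the regexp
  # sees plain bytes instead of broken characters.
  text = text.dup.force_encoding("BINARY") unless text.valid_encoding?
  text.gsub(/[^A-Za-z]/, "")
end

# Made-up sample: a lone byte 0xE9 is not a valid UTF-8 sequence.
bad = "caf\xE9 au lait".dup.force_encoding("UTF-8")
puts bad.valid_encoding?

# Matching the raw string directly is what triggers the error.
begin
  bad.gsub(/[^A-Za-z]/, "")
rescue ArgumentError => e
  puts "gsub on the raw string failed: #{e.message}"
end

puts strip_non_letters(bad)
```

Note that in the fallback case the returned string is tagged BINARY, which
is harmless here because only ASCII letters remain.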
But strangely, Ruby 1.9 is quite happy to deal with invalid strings in
some contexts. For example, if you do
  temp.size.times do |i|
    puts temp[i]
  end
then it will work even if the i'th character is invalid. Go figure.
From: Luca (Email) [mailto:luca.pagano@email.it]
Sent: Thursday, 29 December 2011 07:58
To: ruby-talk ML
Subject: Fwd: Argument error --- How to solve?