How to make stopwords case insensitive

New to Ruby.

I am trying to create a stoplist that is case insensitive. When I run
the code below, the output includes "In", which I do not want. I was
thinking I could use .match(/[A-Z,a-z]/). I did try downcase on the
string text, which worked, but I want to leave the string "text" in its
original form.

Thanks,

John

text = %q{Los Angeles has some of the nicest weather In the country.}
stopwords = %w{the a by on for of are with just but and to the my in I
has some}

#stopwords = stopwords.match(/[A-Z,a-z]/)

words = text.scan(/\w+/)
keywords = words.select { |word| !stopwords.include?(word) }

puts keywords.join(' ')

--
Posted via http://www.ruby-forum.com/.

In reply to JW DW:

You have probably figured this out by yourself already. Anyway, first
convert the stopwords array to lowercase, like this:

stopwords.map!{|el| el.downcase}

This turns the uppercase "I" in your stopwords into "i", so every entry
in the list is lowercase.
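As a small sketch of what that call does (the shortened stoplist and the name "lowered" are only for illustration): map! changes the array in place, while plain map returns a new array and leaves the original alone.

```ruby
stopwords = %w{the in I has}

# map returns a new array; stopwords itself is not touched
lowered = stopwords.map { |el| el.downcase }
p stopwords  # => ["the", "in", "I", "has"]
p lowered    # => ["the", "in", "i", "has"]

# map! replaces each element of stopwords itself
stopwords.map! { |el| el.downcase }
p stopwords  # => ["the", "in", "i", "has"]
```

If you ever need the original spelling of the stopwords later, the non-destructive map is probably the safer choice.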

Your line

keywords = words.select { |word| !stopwords.include?(word) }

almost works. Downcase each word only for the comparison, so that the
string text itself stays unchanged:

keywords = words.select { |word| !stopwords.include?(word.downcase)}
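Putting both steps together, the whole script might look roughly like the sketch below. The variable name "stoplist" and the use of Set and reject are my own choices for illustration, not something the original code requires; an Array with select and a negated test works the same way.

```ruby
require 'set'

text = %q{Los Angeles has some of the nicest weather In the country.}
stopwords = %w{the a by on for of are with just but and to the my in I has some}

# Downcase the stoplist once; a Set also drops the duplicate "the"
stoplist = stopwords.map(&:downcase).to_set

words = text.scan(/\w+/)

# Downcase each word only for the lookup, so the words keep their case
keywords = words.reject { |word| stoplist.include?(word.downcase) }

puts keywords.join(' ')  # prints: Los Angeles nicest weather country
puts text                # the original string is unchanged
```

The downcase call returns a new string each time, so neither text nor the entries in words are modified.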

hth,

Siep
