I tried running the code below to remove stop words from the text
variable. It does not remove the words listed in the stopwords array:
the output reports 11 words and 11 keywords, which cannot be right,
since several of the words are stop words.
text = %q{Los Angeles has some of the nicest weather in the country.};
stopwords = %w{the a by on for of are with just but and to the my in I
has some};
words = text.scan(/\W+/);
keywords = words.select { |word| !stopwords.include?(word) };
puts "#{words.length} words";
puts "#{keywords.length} keywords";
You do not need all the semicolons at line ends (that is a Perlism
not needed in Ruby).
A slight modification of your script reveals what's going on:
$ ruby19 sw.rb
11 words
11 keywords
[" ", " ", " ", " ", " ", " ", " ", " ", " ", " ", "."]
[" ", " ", " ", " ", " ", " ", " ", " ", " ", " ", "."]
$ cat sw.rb
text = %q{Los Angeles has some of the nicest weather in the country.}
stopwords = %w{the a by on for of are with just but and to the my in I
has some}
words = text.scan(/\W+/)
keywords = words.select { |word| !stopwords.include?(word) }
puts "#{words.length} words"
puts "#{keywords.length} keywords"
p words, keywords
$
Hint: it's a case issue.
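To spell the hint out: in a Ruby regular expression, uppercase \W matches non-word characters, so scan(/\W+/) returns the spaces and the final period, which is exactly what the p output above shows. Lowercase \w matches word characters. A minimal sketch of the corrected script, assuming the goal is simply to drop every token that appears in stopwords (reject is used here in place of select with a negated condition; that swap is a style choice, not part of the original script):

```ruby
text = %q{Los Angeles has some of the nicest weather in the country.}
stopwords = %w{the a by on for of are with just but and to the my in I
has some}

# \w+ matches runs of word characters (the words themselves);
# \W+ would match the runs of non-word characters between them.
words = text.scan(/\w+/)

# Keep only the tokens that are not in the stop word list.
keywords = words.reject { |word| stopwords.include?(word) }

puts "#{words.length} words"
puts "#{keywords.length} keywords"
p keywords
```

Note that include? compares strings case-sensitively, so a capitalized "The" at the start of a sentence would not be filtered by this list.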
Cheers
robert
(In reply to the message from Anand Srinivasan <shrianand85@gmail.com>, Fri, Dec 16, 2011 at 9:11 AM, shown at the top of this thread.)