Management of words in a string

Hi All.

I'm trying to write a program that reads a string and counts the
number of words entered.

The problem is that I don't know how to deal with whole words in a
string; I can only handle individual characters or letters.

How can I implement the above?

Thanks.


--
Posted via http://www.ruby-forum.com/.

You can use the String#split method. You have to define precisely what
counts as a word for you. For example, consider things like "one-way
street" or "it's raining", and also be careful with punctuation. A
simplistic approach is to use the default split behaviour, which splits
on whitespace:

s = "this has words. how many? let's see"
[5] pry(main)> s.split
=> ["this", "has", "words.", "how", "many?", "let's", "see"]
[6] pry(main)> s.split.size
=> 7

You can pass a regular expression to the split method to tune how you split.

Jesus.
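
To illustrate the point about passing a regular expression, here is a minimal sketch under one possible definition of "word": a run of letters that may contain apostrophes or hyphens in the middle. The helper name count_words and the pattern are assumptions for this example, not something from the posts above.

```ruby
# Hypothetical helper (not from the thread): counts "words" under one
# possible definition, where apostrophes and hyphens inside a word are kept.
def count_words(text)
  text.scan(/[[:alpha:]]+(?:['-][[:alpha:]]+)*/).size
end

s = "this has words. how many? let's see"

# Splitting on runs of anything that is not a word character, apostrophe,
# or hyphen drops the trailing punctuation from "words." and "many?".
puts s.split(/[^\w'-]+/).inspect

puts count_words(s)                 # 7
puts count_words("one-way street")  # 2
```

Whether "one-way" should count as one word or two is exactly the kind of decision you have to make up front; change the pattern if your definition differs.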


On Fri, Jul 6, 2012 at 5:52 AM, Joao Silva <lists@ruby-forum.com> wrote:

Hi All.

I'm trying to write a program that reads a string and counts the
number of words entered.

The problem is that I don't know how to deal with whole words in a
string; I can only handle individual characters or letters.

How can I implement the above?

Hi,

Joao Silva wrote in post #1067618:

How can I implement the above?

For large text you may use String#scan, which has the advantage of not
collecting all words in an array like String#split does:

input_text = 'This is a sentence.'
word_count = input_text.strip.scan(/\s+/).size + 1

But like Jesus already said, this simple approach will not always work.
If the "words" in your text may contain whitespace, then looking for
whitespace will obviously fail. You'll have to use a dictionary in this
case. This would also cover errors (missing or superfluous whitespace).
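
A rough sketch of the dictionary idea, assuming you already have a list of multi-word terms that should each count as a single word. PHRASES and count_with_phrases are hypothetical names made up for this example, and the sketch does not attempt to repair missing or superfluous whitespace.

```ruby
# Hypothetical sample dictionary of multi-word terms (an assumption for
# this sketch; a real list would come from your own data).
PHRASES = ["New York", "ice cream"].freeze

# Counts words, treating each dictionary phrase as a single word.
def count_with_phrases(text, phrases)
  # Try the phrases first, then fall back to plain runs of word characters.
  phrase_patterns = phrases.map { |phrase| /\b#{Regexp.escape(phrase)}\b/ }
  pattern = Regexp.union(*phrase_patterns, /\w+/)
  text.scan(pattern).size
end

text = "I ate ice cream in New York"
puts text.split.size                    # 7
puts count_with_phrases(text, PHRASES)  # 5
```

The phrases come first in the alternation so that "ice cream" is matched as a whole before the fallback pattern can pick up "ice" on its own.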


str.scan(/a\w+/).size


Yeah, sorry, I was dumb.

str = "bag of bananas and one apple"
str.scan(/\Wa\w+/).size
=> 2


Hm, still wrong. The best thing I could do is this:

str = "a bag of bananas and one apple"
str.scan(Regexp.union(/^a\w*/,/\Wa\w*/))
=> ["a", " and", " apple"]


Hi,

Joao Silva wrote in post #1067618:

How can I implement the above?

For large text you may use String#scan, which has the advantage of not
collecting all words in an array like String#split does:

word_count = 0
input_text.scan(/\w+/){ word_count += 1}

input_text = 'This is a sentence.'
word_count = input_text.strip.scan(/\s+/).size + 1

I don't think this usage of #scan is a good approach, because it will
yield totally wrong results:

irb(main):002:0> input_text = '. : & #'
=> ". : & #"
irb(main):003:0> input_text.strip.scan(/\s+/).size + 1
=> 4

Whereas positively matching sequences of word characters comes much
closer to reality:

irb(main):004:0> input_text.scan(/\w+/).size
=> 0
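
For a side-by-side view, here is a small sketch that wraps both approaches in methods and runs them on a few inputs, including the empty string. The method names are hypothetical and exist only for this comparison.

```ruby
# Hypothetical helper names, introduced only for this comparison.

# Counts runs of whitespace and adds one.
def count_by_whitespace(text)
  text.strip.scan(/\s+/).size + 1
end

# Counts runs of word characters using the block form of String#scan.
def count_by_word_chars(text)
  count = 0
  text.scan(/\w+/) { count += 1 } # block form: no result array is built
  count
end

["This is a sentence.", ". : & #", ""].each do |text|
  puts "#{text.inspect}: whitespace=#{count_by_whitespace(text)}, " \
       "word chars=#{count_by_word_chars(text)}"
end
```

Both agree on the plain sentence (4), but the whitespace version reports 4 for the punctuation-only string and 1 for the empty string, while the word-character version reports 0 for both.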

But like Jesus already said, this simple approach will not always work.
If the "words" in your text may contain whitespace, then looking for
whitespace will obviously fail. You'll have to use a dictionary in this
case. This would also cover errors (missing or superfluous whitespace).

It's crucial to clarify the definition of "word", I agree.

Kind regards

robert


On Fri, Jul 6, 2012 at 10:06 AM, Jan E. <lists@ruby-forum.com> wrote:

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

"Jesús Gabriel y Galán" <jgabrielygalan@gmail.com> wrote in post
#1067632:

You can use the String#split method. You have to define precisely what
counts as a word for you. For example, consider things like "one-way
street" or "it's raining", and also be careful with punctuation. A
simplistic approach is to use the default split behaviour, which splits
on whitespace:

s = "this has words. how many? let's see"
[5] pry(main)> s.split
=> ["this", "has", "words.", "how", "many?", "let's", "see"]
[6] pry(main)> s.split.size
=> 7

You can pass a regular expression to the split method to tune how you
split.

Jesus.

And what if you want to count the words that begin with a particular
letter (for example "a")?


##############################################
ct=0

print "Enter a string: "
str=gets.chomp.to_s

puts "Word ==> #{str.split}"

if str.chr == "a"
  ct=ct+1
end

puts "Number of words that start with a: #{ct}"

#################################################
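
Note that String#chr returns only the first character of the whole string, so the check above looks at the string once rather than at each word. A minimal sketch of a per-word check, assuming words are whitespace-separated and case should be ignored (count_words_starting_with is a hypothetical helper name):

```ruby
# Hypothetical helper: counts whitespace-separated words that start with
# the given letter, ignoring case.
def count_words_starting_with(text, letter)
  text.split.count { |word| word.downcase.start_with?(letter.downcase) }
end

str = "a bag of bananas and one apple"
puts "Number of words that start with a: #{count_words_starting_with(str, 'a')}"
```

With this input the count is 3 ("a", "and", "apple"). A word with leading punctuation, such as "(apple", would not be counted under this simple definition.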


Hans Mackowiak wrote in post #1067740:

str.scan(/a\w+/).size

Clearly wrong.

str = "bag of bananas"
str.scan(/a\w+/).size
=> 2


Still wrong, sorry Hans :(

str = "apple and banana"
str.scan(/\Wa\w+/).size
=> 1

A correct regex would be (I hope I don't get it wrong now) /\ba\B/.

-- Matma Rex

Hans Mackowiak wrote in post #1067783:

Hm, still wrong. The best thing I could do is this:

Try \ba\w*

(\b = word boundary)
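
A quick check of that pattern on the sample string used earlier in the thread, as a sketch. It also runs /\ba\B/ for comparison, which as far as I can tell skips the one-letter word "a", because \B requires another word character to follow.

```ruby
str = "a bag of bananas and one apple"

# \b anchors the match at the start of a word, so the "a" inside "bag"
# and "bananas" is ignored, and no leading space ends up in the result.
matches = str.scan(/\ba\w*/)
puts matches.inspect # ["a", "and", "apple"]
puts matches.size    # 3

# /\ba\B/ needs a word character after the "a", so the lone "a" is missed.
puts str.scan(/\ba\B/).size # 2

# Add the i flag if capitalized words should count too.
puts "Apple and banana".scan(/\ba\w*/i).size # 2
```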
