I have a text file with phrases that I'm looking to split into chunks.
Take the following keyword list:
the brown fox jumped,
over the fence,
This should produce the following output:
the,
the brown,
the brown fox,
the brown fox jumped,
brown fox,
brown fox jumped,
fox,
fox jumped,
jumped,
over,
over the,
over the fence,
the,
the fence,
fence
I'm currently using the following code which splits after each space:
def count_frequency
  the_file='D:/Ruby/projects/data.txt'
  h = Hash.new
  f = File.open(the_file, "r")
  f.each_line { |line|
    words = line.split
    words.each { |w|
      if h.has_key?(w)
        h[w] = h[w] + 1
      else
        h[w] = 1
      end
    }
  }
  # sort the hash by value, and then print it in this sorted order
  h.sort{|a,b| a[1]<=>b[1]}.each { |elem|
    puts "\"#{elem[0]}\" has #{elem[1]} occurrences"
  }
end
By the look of this, do I just need to append more words to the words
array, using a different slice?
You're right. I'm still a beginner to Ruby; however, I have still tried
researching what I'm looking for and come up with no results. I tried
manipulating the starting code, but it did not return any related results,
so I asked a question... surely that's what these forums are for! (I'm sure
you would have done something like this, learning by example, when you
first started too!)
Although I don't appreciate your tone and communication skills (perhaps
you need a lesson), thank you for your technical help.
OK, taking on board all that has been said so far... this is what I'm
hoping to achieve (short-term help needed as I am a beginner): take a
list of strings from a text file, run through each string and split
it into as many combinations as possible, then count all the occurrences
of each new string produced by the split and print them to the console
as output.
To incorporate the occurrence count for each keyword, do we need to put it
into a hash similar to the first example I gave, or is it possible to
directly link that up with the output?
The previous example I had was:
    words.each { |w|
      w.lstrip
      if h.has_key?(w)
        h[w] = h[w] + 1
      else
        h[w] = 1
      end
    }
  }
  # sort the hash by value, and then print it in this sorted order
  h.sort{|a,b| a[1]<=>b[1]}.each { |elem|
    puts "\"#{elem[0]}\" has #{elem[1]} occurrences"
  }

      0.upto limit do |start|
        start.upto limit do |stop|
          puts phrase[start..stop].join(' ')
        end
      end
    counts.sort_by {|w,c| -c}.each do |w,c|
      printf "%6d %s\n", c, w
    end
  end
end
Would the counter hash with the key go underneath the 'puts' in the loop
so that it records each step? At the minute it still just outputs the
new strings without the ordering.
Ryan, since you brought up communication skills: from your original
posting it is not entirely clear to me what you want to do. Do you want
to count word occurrences? Do you want to generate permutations of all
subsets of words found in a document? Or do you want to generate all
sub sequences of each phrase (line) in the document?
A few remarks: the usual counting idiom is this
counters = Hash.new 0
...
counters[key] += 1
If you need to append to Array per key, you can do
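One common way to do this, sketched here on the assumption that the block form of Hash.new is what is meant. The names `lists`, `h` and `k` and the sample values are illustrative only:

```ruby
# Block form of Hash.new: the block runs the first time a missing key is
# read, and here it stores a fresh, empty Array under that key.
lists = Hash.new { |h, k| h[k] = [] }

# Made-up sample data: remember the line numbers on which a word was seen.
lists["the"] << 1
lists["the"] << 2
lists["fox"] << 1

p lists
```

Note that only the Array stored under the key is appended to; `h` and `k` are just the block's parameters (the hash itself and the missing key).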
OK, taking on board all that has been said so far... this is what I'm
hoping to achieve (short-term help needed as I am a beginner): take a
list of strings from a text file
Check File#read (either "ri File#read" on the command line, or on
ruby-doc.org). The gist:
data = File.read "myfile"
and run through each string and split it in as many combinations as possible
There's a lot of splitting possible!
Though, I guess you want to split a sentence into its words, correct?
Either way, String#split is what you want (probably "A string".split(" "),
which splits the string at spaces).
, then count all the occurrences of each new string that is split and provide them in the console as
output.
Well, once you split your string, you get an Array of chunks (or
tokens, if you prefer): ["A", "string"]. So the question is: Do you
want to get every possible combination, or a subset of these
combinations (as in the example provided in your OP)?
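If the subset in question is every contiguous run of words, which is how the example output in the original post reads, a minimal sketch could look like the following. The helper name `sub_phrases` is hypothetical, and Enumerable#each_cons is swapped in here for brevity; it is not something the thread itself uses:

```ruby
# Hypothetical helper: returns every contiguous run of words in a phrase.
def sub_phrases(phrase)
  words = phrase.split
  # For each run length n, each_cons(n) yields every window of n
  # neighbouring words; each window is joined back into a string.
  (1..words.size).flat_map do |n|
    words.each_cons(n).map { |run| run.join(" ") }
  end
end

p sub_phrases("over the fence")
# => ["over", "the", "fence", "over the", "the fence", "over the fence"]
```

The results come out grouped by length rather than by starting word, so the order differs from the listing in the original post, but the set of chunks is the same for this phrase.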
On Wed, Jul 20, 2011 at 8:16 PM, Ryan Mckenzie <ryan@souliss.com> wrote:
To incorporate the occurrence count for each keyword do we need to put it
into a hash similar to the first example I gave or is it possible to
directly link that up with the output?
Please see what I called "counting idiom" above.
# sort the hash by value, and then print it in this sorted order
h.sort{|a,b| a[1]<=>b[1]}.each { |elem|
puts "\"#{elem[0]}\" has #{elem[1]} occurrences"
}
To print in descending order you can also do
counts.sort_by {|w,c| -c}.each do |w,c|
printf "%6d %s\n", c, w
end
0.upto limit do |start|
start.upto limit do |stop|
puts phrase[start..stop].join(' ')
end
end
counts.sort_by {|w,c| -c}.each do |w,c|
printf "%6d %s\n", c, w
end
end
end
Would the counter hash with the key go underneath the 'puts' in the loop
so that it records each step? At the minute it still just outputs the
new strings without the ordering.
At the moment I would be surprised to see any output from counts because
you never update it. You also do not close the file properly (you could
make your life easier by using File.foreach) and I believe there is also
a spelling error ("if.each"). Did this program actually run and work?
Btw, the_file should rather be a method argument IMHO.
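A rough sketch of how those remarks might fit together. Several things here are assumptions rather than anything stated in the thread: the method name `count_sub_phrases` is made up, `limit` is taken to be the last index of `phrase`, and a trailing comma on a line is treated as a separator rather than part of the phrase:

```ruby
# Sketch only. The file path is a method argument instead of being hard-coded.
def count_sub_phrases(path)
  counts = Hash.new(0)   # counting idiom: missing keys start at 0

  # File.foreach opens the file, yields each line, and closes it afterwards.
  File.foreach(path) do |line|
    # Assumption: drop surrounding whitespace and one trailing comma.
    phrase = line.strip.chomp(",").split
    next if phrase.empty?

    limit = phrase.size - 1   # assumption: last valid index of phrase
    0.upto(limit) do |start|
      start.upto(limit) do |stop|
        # Update the counter where the 'puts' used to be.
        counts[phrase[start..stop].join(" ")] += 1
      end
    end
  end

  counts
end

# Usage: ruby this_script.rb path/to/data.txt
if __FILE__ == $0 && ARGV[0]
  count_sub_phrases(ARGV[0]).sort_by { |w, c| -c }.each do |w, c|
    printf "%6d %s\n", c, w
  end
end
```

Returning the hash and printing it separately keeps the counting step testable on its own; the sorting and printf lines are the ones already shown above.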
Did you really post this through the forum? Interestingly there I
cannot see your sentence "As a beginner...". How weird is that? Does
the forum -> mailing list gateway add content?
Kind regards
robert
On Wed, Jul 20, 2011 at 12:18 PM, 7stud -- <bbxx789_05ss@yahoo.com> wrote:
Ryan Mckenzie wrote in post #1011820:
You're right. I'm still a beginner to Ruby, however I have still tried
researching what I'm looking for and come up with no results.
As a beginner to ruby programming, you should be writing all programs
from scratch--not trying to alter some program you found on the
internets.
Sorry, that was a typo with the 'if'. I changed that to 'f'.
Unfortunately, no, I could not get it working. I will update the file so
it closes as you mentioned. How would I go about integrating the count
with the phrase[start..stop] to insert those into the hash?
Altering programs found online is a fantastic way to learn. Thank god for open source/free software. Of course it depends on your goals, but I would not limit learning to just writing everything from scratch.
Lake
On Jul 20, 2011, at 10:52 AM, Phillip Gawlowski <cmdjackryan@gmail.com> wrote:
On Wed, Jul 20, 2011 at 12:18 PM, 7stud -- <bbxx789_05ss@yahoo.com> wrote:
As a beginner to ruby programming, you should be writing all programs
from scratch--not trying to alter some program you found on the
internets.
Ah yes, I see what you mean now. OK, so now I'm left with trying to use
the hash key to assign the phrases. Do I need to increment the h and k
each time, or just one of them?