How to match and count

Jean_G · 22 November 2009 02:09

Hello,

I have been using awk for some text handling.
Now I'm starting out with Ruby (a real newbie) and want to find a way
to do in Ruby what this awk command does:

awk '{if ($4~/something/) {i+=1}} END {print i}' file.txt

That means: if a line's 4th field matches "something", increase the
counter by 1.
How do I write the corresponding Ruby code?

Thanks in advance.

Gavin_Kistner3 · 22 November 2009 02:45

What is a 'field'? Whitespace delimited?

Jean_G · 22 November 2009 03:00

Yes, thanks.

Gavin_Kistner3 · 22 November 2009 04:46

Here are two ways:

# Don't read the whole file into memory, but do it one line at a time
i = 0
file = File.open( "foo.txt" )
file.each_line do |line|
  pieces = line.split( /\s+/ )
  i += 1 if pieces[ 3 ] =~ /something/
end

# Just read the whole file at once, assuming it's small enough,
# and create an array of the fourth column's values
col = File.read("foo.txt").scan(/.+/).map{ |line| line.scan(/\S+/)[3] }
i = col.count{ |val| val =~ /something/ }

Jean_G · 22 November 2009 09:12

I like that, thank you!

Robert_K1 · 22 November 2009 14:52

The code above does not close the file handle properly. Also, it can be written more briefly:

File.foreach "file.txt" do |line|
...
end
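
For completeness, here is a minimal sketch of the File.foreach form with the loop body filled in from the earlier split-based example. The method name `count_matches` and its parameters are made up for illustration. `File.foreach` opens the file, yields each line, and closes the file when the block finishes.

```ruby
# Hypothetical helper: counts the lines whose 4th whitespace-delimited
# field matches the pattern. File.foreach closes the file for us.
def count_matches(path, pattern, field_index = 3)
  count = 0
  File.foreach(path) do |line|
    # String#split with no argument splits on runs of whitespace and
    # ignores leading whitespace, like awk's default field splitting.
    field = line.split[field_index]
    # Regexp#=~ returns nil for a nil field, so short lines are skipped.
    count += 1 if pattern =~ field
  end
  count
end
```

It could be called as `puts count_matches("file.txt", /something/)`.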

You can even use Ruby like awk, which seems to be rarely done, but it's possible.

awk '{if ($4~/something/) {i+=1}} END {print i}' file.txt

can be done like this:

ruby -nae 'BEGIN {$i=0}; $i+=1 if /something/ =~ $F[3]; END {puts $i}' file.txt
ruby -nae 'BEGIN {$i=0}; /something/ =~ $F[3] and $i+=1; END {puts $i}' file.txt
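
Roughly speaking, -n wraps the script in a "while gets ... end" loop, -a splits each line into $F (as $_.split would), and -e takes the script from the command line. Below is a sketch of what the one-liner amounts to, written with a local counter and an explicit IO argument so it can be tried without a file. The method name `awk_like_count` and the sample input are made up for illustration.

```ruby
require "stringio"

# Hypothetical helper mimicking `ruby -nae` for this one task.
def awk_like_count(io, pattern)
  count = 0                      # BEGIN { $i = 0 }
  while (line = io.gets)         # -n: loop over every input line
    fields = line.split          # -a: roughly what $F would hold
    count += 1 if pattern =~ fields[3]
  end
  count                          # END { puts $i } would print this
end

sample = StringIO.new("a b c something\na b c d\n")
puts awk_like_count(sample, /something/)   # prints 1
```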

For a script, I'd probably do something similar to what Phrogz suggested but with the difference that I'd use ARGF. That way you fetch file names from the command line and do not need to change the script if the file name changes:

i = 0

ARGF.each do |line|
  bit = line.split(/\s+/)[3]
  i += 1 if /something/ =~ bit
end

puts i

Or do the matching in one step, which seems more efficient:

i = 0

ARGF.each do |line|
  # \S* lets "something" match anywhere inside the 4th field, as awk's $4~/something/ does
  i += 1 if /^\s*(?:\S+\s+){3}\S*something/ =~ line
end

puts i
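
One way to gain confidence that a single-regex version counts the same lines as a split-based one is to run both over a few sample lines. The sketch below is only an illustration: the sample data and the names `SINGLE_PASS`, `by_split`, and `by_regex` are invented, and the pattern assumes that a match anywhere inside the fourth field should count, as with awk's $4~/something/.

```ruby
# Skip three whitespace-delimited fields, then look for "something"
# anywhere inside the fourth one (\S* cannot cross into a fifth field).
SINGLE_PASS = /\A\s*(?:\S+\s+){3}\S*something/

def by_split(lines)
  lines.count { |line| /something/ =~ line.split[3] }
end

def by_regex(lines)
  lines.count { |line| SINGLE_PASS =~ line }
end

lines = [
  "a b c something e",    # 4th field matches
  "a b c xsomething e",   # match in the middle of the 4th field
  "a b c d something",    # "something" is in the 5th field only
  "  a b c something",    # leading whitespace before the 1st field
  "a b something",        # only three fields
]

puts by_split(lines)   # prints 3
puts by_regex(lines)   # prints 3
```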

There are about 2,843 million other ways to do it in Ruby.

Kind regards

robert

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

Jean_G · 23 November 2009 05:32

Oops, this is the same way as in Perl.

I would adjust one point. It should be:

$F[3] =~ /something/;

not:

/something/ =~ $F[3];

After that, replace "ruby" with "perl" in the commands above and they
will work too.

Thanks~

Robert_K1 · 23 November 2009 08:33

>> ruby -nae 'BEGIN {$i=0}; $i+=1 if /something/ =~ $F[3]; END {puts $i}' file.txt
>> ruby -nae 'BEGIN {$i=0}; /something/ =~ $F[3] and $i+=1; END {puts $i}' file.txt
>
> oops this is the same way as Perl's.
>
> I may adjust one point:
>
> should be:
> $F[3] =~ /something/;
>
> not:
> /something/ =~ $F[3];

Why?

> After that replace "ruby" to "perl" on above commands and that will be
> working too.

Ah, you want a single program to work both for Perl and Ruby. Thanks
for sharing!

I usually prefer to have the regular expression as the first argument
to =~ because for me that seems more natural (the regexp is doing the
matching) and IIRC it is a tad faster (but really only a tad).
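
In Ruby both operand orders return the same thing: the index of the first match, or nil. Here is a small sketch with made-up sample values. The one behavioral difference shown is that a regexp literal with named groups on the left-hand side also assigns local variables; the speed difference is not measured here.

```ruby
field = "xsomething"

p(/something/ =~ field)   # index of the match
p(field =~ /something/)   # same result

# A missing 4th field shows up as nil; both orders handle it.
missing = nil
p(/something/ =~ missing)
p(missing =~ /something/)

# Only the regexp-literal-first form creates locals from named groups.
if /(?<word>some\w+)/ =~ field
  puts word
end
```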

Kind regards

robert
