Speeding up

Hi, Rubyists.

I’ve been playing with some cross-language comparisons in preparation for a
talk I will be giving at work on Ruby. I was playing with a simple word count
program and made the following (see below). It performs pretty well compared
with Java, which kind of pleasantly surprised me ;-).

My question is this: 30% of the time is spent in Array#each. Is there any
way to speed this up and still keep some semblance of readability?

I was also interested to find that the line count was off by one. I used
RFC1000 as input, and its last line is just ^L followed by EOF. I ignore these
characters in the C and Java versions.
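
One likely explanation for the extra line: File.readlines returns one array element per line, including a final fragment that has no terminating newline, while a wc-style counter counts newline characters. A minimal sketch on a made-up string (the `sample` text, and the assumption that the file ends in a bare form feed, are illustrative only):

```ruby
# Hypothetical sample: two complete lines followed by a form feed (^L)
# with no terminating newline, standing in for the end of the RFC file.
sample = "first line\nsecond line\n\f"

lines_as_array = sample.lines.length  # the trailing "\f" fragment counts as a line
newline_count  = sample.count("\n")   # only newline characters are counted

puts "lines.length = #{lines_as_array}"
puts "newline count = #{newline_count}"
```

If that is what happens here, counting newlines instead of array elements should bring the Ruby figure in line with the other two versions.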

Regards,
-mark.

------------------------------
#! /usr/bin/ruby

···

if ARGV.size != 1
  puts "usage: #{$0} "
  exit(1)
end

f = File.readlines(ARGV[0])

nl = f.length
nw = nc = 0
f.each { |line|
  nw += line.split.size
  nc += line.length
}
puts " #{nl} #{nw} #{nc} #{ARGV[0]}"

-------------------------

$ time ruby wc.rb …/rfc.txt
8642 40156 315315 …/rfc.txt

real 0m0.450s user 0m0.030s sys 0m0.010s

$ time wc1.exe …/rfc.txt
8641 40305 315315 …/rfc.txt

real 0m0.141s user 0m0.020s sys 0m0.020s

$ time java wc …/rfc.txt
8641 40156 323956 …/rfc.txt

real 0m0.420s user 0m0.010s sys 0m0.040s

“Mark Probert” probertm@nortelnetworks.com wrote in message
news:5.1.0.14.2.20021204171209.020808e8@zcard04k.ca.nortel.com

What are you measuring Ruby’s speed for?
Are you seriously intending to use Ruby for such tasks as processing images in
real time?
I see Ruby as a perfect language for building sophisticated programs out of
simple but quick blocks of native code.

Mark Probert probertm@nortelnetworks.com writes:

My question is this: 30% of the time is spent in Array#each. Is there any
way to speed this up and still keep some semblance of readability?

Don’t use Array#each.

#! /usr/bin/ruby

···

if ARGV.size != 1
  puts "usage: #{$0} "
  exit(1)
end

f = File.open(ARGV[0]){|fh| fh.read}

nl = f.count("\n")
nw = f.tr_s("^\t\n\v\f\r ", "x").count("x")
nc = f.size

puts " #{nl} #{nw} #{nc} #{ARGV[0]}"
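
To see why the tr_s line counts words: the leading ^ negates the character set, so every run of non-whitespace characters is translated to "x" and squeezed down to a single "x". Counting the "x" characters then gives the number of words. A small sketch on a made-up string (the `sample` text and variable names are illustrative only):

```ruby
# Hypothetical input with mixed whitespace between words.
sample = "the quick  brown\tfox\njumps\n"

# Translate every non-whitespace run to a single "x"; whitespace is untouched.
squeezed = sample.tr_s("^\t\n\v\f\r ", "x")

words_by_tr    = squeezed.count("x")  # one "x" per word
words_by_split = sample.split.size    # the per-line approach from the original script

p squeezed
puts "tr_s/count: #{words_by_tr}, split/size: #{words_by_split}"
```

Both counts should agree on ordinary text. The difference is that tr_s and count make single passes over one big String, rather than building an Array of words for every line.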


eban

Mark Probert probertm@nortelnetworks.com writes:

f = File.readlines(ARGV[0])

I think if you use #sysread, it will improve the time too.

jenny:~> ruby tmp.rb rfc2571.txt
                     user     system      total        real
using readlines  5.460000   0.140000   5.600000 (  5.720842)
using sysread    4.720000   0.180000   4.900000 (  5.006272)

The time difference is small since most of the time is spent in
manipulating Strings and Arrays. However, with larger files, the
difference will become obvious, especially in simple operations like
file copying, etc.
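
The attached tmp.rb is not reproduced in this thread, so the following is only a guess at the shape of such a comparison, using the standard benchmark library. The generated sample file, the repeat count REPEATS, and the helper names count_with_readlines and count_with_sysread are all assumptions made for illustration:

```ruby
require "benchmark"
require "tempfile"

REPEATS = 20 # hypothetical repeat count, not taken from tmp.rb

# Count lines, words and characters via File.readlines.
def count_with_readlines(path)
  lines = File.readlines(path)
  nw = nc = 0
  lines.each do |line|
    nw += line.split.size
    nc += line.length
  end
  [lines.length, nw, nc]
end

# Same counts, but the file is pulled in with a single sysread call.
# For a regular file this normally returns the whole content at once,
# although sysread does not strictly guarantee that.
def count_with_sysread(path)
  data = File.open(path, "rb") { |fh| fh.sysread(File.size(path)) }
  nl = nw = nc = 0
  data.each_line do |line|
    nl += 1
    nw += line.split.size
    nc += line.length
  end
  [nl, nw, nc]
end

# Generated stand-in for the RFC text file used in the thread.
sample = Tempfile.new("wc_sample")
sample.write("some words on a line\n" * 10_000)
sample.close
path = sample.path

Benchmark.bm(16) do |bm|
  bm.report("using readlines") { REPEATS.times { count_with_readlines(path) } }
  bm.report("using sysread")   { REPEATS.times { count_with_sysread(path) } }
end
```

Because both helpers still split every line, the String and Array work is identical and only the reading strategy differs, which is the part the timings above are meant to isolate.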

YS.

tmp.rb (855 Bytes)

Hi, Alexi.

At 07:33 PM 12/5/2002 +0900, you wrote:

What are You measuring Ruby’s speed for?

Fun. I find that Ruby is “fast enough” for almost all of my work.
My question was to find out how to speed up one little section that
“profile” indicated was the major time consumer.

Regards,

-mark.

Hi, eban.

At 08:14 PM 12/5/2002 +0900, you wrote:

Mark Probert probertm@nortelnetworks.com writes:

My question is this: 30% of the time is spent in Array#each. Is there any
way to speed this up and still keep some semblance of readability?

Don’t use Array#each.

f = File.open(ARGV[0]){|fh| fh.read}

Interesting.

The first run was pretty much the same for both versions. The
subsequent runs were much faster using read. I guess that I have some
file caching going on. (I tested under Win2k using 1.7.2)

Regards,

-mark.