My script showing Python speed vs. Ruby (long, includes code)

OS = Windows XP
File size = 24MB
Total Keywords: 51

Ruby version = ruby 1.7.2 (2002-07-02) [i386-mswin32]
Python = ActivePython-2.2.1-222

PYTHON SCRIPT:

# import the needed libs
import sys, string, re, time

# make sure the command line arguments are there
if len(sys.argv) < 3:
    sys.exit("usage: fread.py [log file] [hit file]")

# start the timer
start = time.time()

# open the files with some error checking
try:
    inFile = open(sys.argv[1], "r")
except IOError:
    sys.exit("Cannot open log file!")

try:
    outFile = open(sys.argv[2], "w")
except IOError:
    sys.exit("Cannot open hits file!")

# build list of keywords
# I took my list out since the words are bad ones
kw = [ "keywords", "here" ]

# loop through the list and print the lines to a file
for line in inFile.xreadlines():
    for badword in kw:
        if line.find(badword) > -1:
            found = '%s %s' % (badword, line)
            #print found             # Print the result
            outFile.write(found)    # Write the result

# close the files
inFile.close()
outFile.close()

# let me know when it's done
#puts "Finished processing file..."
print "Python took:%6.2f seconds" % (time.time()-start)

RUBY SCRIPT:

# populate the array
# keywords deleted to protect innocent eyes
kw = [ "keywords", "here" ]

# Ruby provides micro-seconds in Time object
Time.now.usec
start = Time.now

# open the files
inFile = File.open(ARGV[0], "r")
outFile = File.new(ARGV[1], "w+")

# search for a word and print the entire line
inFile.each_line { |line|
  for word in kw
    if line.include?(word)
      #print line
      outFile.write "#{word} #{line}"
    end
  end
}

# close the files
inFile.close
outFile.close
elapsed = Time.now - start

# visual clue script is done
#puts "Finished processing file..."
puts "Ruby took #{elapsed} seconds."

Both were started from scratch one time.

Ruby: 83.938 seconds
Python: 18.52 seconds

I do NOT know whether Python’s xreadlines is doing the same thing as
Ruby’s each_line in my script. If each_line is not equivalent, please
let me know so I can run it again.

Bob
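
On the equivalence question: both Python's xreadlines and Ruby's each_line hand the loop one line at a time, trailing newline included, instead of loading the whole file first. A minimal sketch of the Ruby side follows. It uses an in-memory StringIO as a stand-in for the log file, and the helper name collect_lines and the sample text are invented for illustration, not taken from Bob's script.

```ruby
require 'stringio'

# Collect what each_line yields, one array entry per iteration.
def collect_lines(io)
  lines = []
  io.each_line { |line| lines << line }
  lines
end

# Stand-in for the log file; the contents are invented for this example.
log = StringIO.new("first line\nsecond line\nlast line, no newline")
p collect_lines(log)
```

Each yielded string keeps its line terminator, which is why both scripts can write the hit straight to the output file without adding a newline.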

Bob wrote:

> Ruby: 83.938 seconds
> Python: 18.52 seconds

It seems that String#include? is not implemented efficiently
in Ruby (in fact, it is a simple loop of memcmp calls).

When I replaced the list of strings with a list of regular
expressions (not a single combined one, but one per keyword, 6 keywords
in all), I got the following on my Linux box (70MB logfile):

Python 2.2 (#1, Mar 26 2002, 15:46:04)
ruby 1.7.3 (2002-09-23) [i686-linux]
ruby 1.6.7 (2002-03-01) [i686-linux]

Python took: 28.18 seconds
Ruby 1.7.3 took 19.199376 seconds.
Ruby 1.6.7 took 23.12174 seconds.

the result file is:
wc z
177948 1398284 14462965 z

I used the same keywords and the output files were identical.
I did not alter the Python program, so it did not use regular
expressions.

Regards, Christian
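
Christian's modified Ruby script is not shown in the thread. A minimal sketch of what "a list of regular expressions" could look like is given here, assuming one escaped Regexp per keyword and the same "keyword line" output format as Bob's script. The helper names build_patterns and scan_line and the sample keywords are hypothetical.

```ruby
# Build one Regexp per keyword, escaped so each matches its keyword literally.
def build_patterns(keywords)
  keywords.map { |word| [word, Regexp.new(Regexp.escape(word))] }
end

# Return one "keyword line" hit per keyword that matches the line,
# in the format the original script writes to the hits file.
def scan_line(line, patterns)
  hits = []
  patterns.each do |word, re|
    hits << "#{word} #{line}" if line =~ re
  end
  hits
end

# Hypothetical keywords; the real list was withheld in the thread.
patterns = build_patterns(["error", "denied"])
puts scan_line("access denied: disk error\n", patterns)
```

The patterns are compiled once, outside the per-line loop, so the loop body only performs the matches.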

Hello Bob,

Tuesday, September 24, 2002, 4:00:25 PM, you wrote:

> Both were started from scratch one time.
>
> Ruby: 83.938 seconds
> Python: 18.52 seconds

hmmm. try to run scripts in reverse order

--
Best regards,
Bulat mailto:bulatz@integ.ru

> It seems that String#include? is not implemented efficiently
> in Ruby (in fact, it is a simple loop of memcmp calls).

Situations like those are why I always check my programs through profile.rb
;0)


_/ _/ _/ _/ _/ _/ _/ _/ _/ _/ _/ _/ _/ _/ _/ _/ _/ _/ _/ _/ _/
Bruce Williams http://www.codedbliss.com
iusris/#ruby-lang bruce@codedbliss.com
_/ _/ _/ _/ _/ _/ _/ _/ _/ _/ _/ _/ _/ _/ _/ _/ _/ _/ _/ _/ _/

Bulat Ziganshin wrote:

> Hello Bob,
>
> Tuesday, September 24, 2002, 4:00:25 PM, you wrote:
>
>> Both were started from scratch one time.
>>
>> Ruby: 83.938 seconds
>> Python: 18.52 seconds
>
> hmmm. try to run scripts in reverse order

There may be cache effects, but I could consistently reproduce a 2x
runtime difference between Ruby and Python over several consecutive
runs in permuted order. The problem really is String#include?.

Regards, Christian
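
The "several runs in permuted order" check can be sketched with Ruby's standard Benchmark module. This is an illustration under stated assumptions, not the measurement Christian ran: the data is synthetic, the helper names count_with_include and count_with_regexp are hypothetical, and the timings it prints depend heavily on the Ruby version and machine.

```ruby
require 'benchmark'

# Count keyword hits using plain substring search.
def count_with_include(lines, keywords)
  lines.sum { |line| keywords.count { |word| line.include?(word) } }
end

# Count keyword hits using one precompiled Regexp per keyword.
def count_with_regexp(lines, patterns)
  lines.sum { |line| patterns.count { |re| line =~ re } }
end

# Synthetic data, invented for this sketch; a real run would read the log file.
kw = ["error", "denied"]
patterns = kw.map { |word| Regexp.new(Regexp.escape(word)) }
lines = Array.new(20_000) { |i| "request #{i} #{i % 7 == 0 ? 'denied' : 'ok'}\n" }

# Run both variants several times, alternating which goes first, so that
# warm-up and cache effects do not always favour the same one.
3.times do |round|
  variants = [
    ["include?", -> { count_with_include(lines, kw) }],
    ["regexp",   -> { count_with_regexp(lines, patterns) }],
  ]
  variants.reverse! if round.odd?
  variants.each do |name, job|
    seconds = Benchmark.realtime { job.call }
    puts format("round %d %-8s %.4f s", round, name, seconds)
  end
end
```

Both helpers return the same hit count, so any difference in the printed times comes from the matching method rather than from different amounts of work.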

“Bruce Williams” bruce@codedbliss.com wrote in message
news:200209241935.43877.bruce@codedbliss.com

> Situations like those are why I always check my programs through
> profile.rb
> ;0)



C:\Ruby\lib\ruby\1.7>profile.rb d:\Robert\Projects\Ruby\fread.rb
  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
  0.00     0.00      0.00        1     0.00    10.00  #toplevel

:)

Bob X wrote:

> C:\Ruby\lib\ruby\1.7>profile.rb d:\Robert\Projects\Ruby\fread.rb
>   %   cumulative   self              self     total
>  time   seconds   seconds    calls  ms/call  ms/call  name
>   0.00     0.00      0.00        1     0.00    10.00  #toplevel
>
> :)

FYI: I also have a Ruby build compiled with -pg (the profiling support
flag), which is sometimes more helpful (but not on big programs).

Regards, Christian

I’ve just gotten into the habit of using -rprofile

On Wednesday 25 September 2002 05:02 am, Christian Szegedy wrote:

> FYI: I also have a Ruby build compiled with -pg (the profiling support
> flag), which is sometimes more helpful (but not on big programs).
>
> Regards, Christian


_/ _/ _/ _/ _/ _/ _/ _/ _/ _/ _/ _/ _/ _/ _/ _/ _/ _/ _/ _/ _/
Bruce Williams http://www.codedbliss.com
iusris/#ruby-lang bruce@codedbliss.com
_/ _/ _/ _/ _/ _/ _/ _/ _/ _/ _/ _/ _/ _/ _/ _/ _/ _/ _/ _/ _/