OS = Windows XP
File size = 24MB
Total keywords = 51
Ruby version = ruby 1.7.2 (2002-07-02) [i386-mswin32]
Python = ActivePython-2.2.1-222
PYTHON SCRIPT:
# import the needed libs
import sys, string, re, time
# make sure the command line arguments are there
if len(sys.argv) < 3:
    sys.exit("usage: fread.py [log file] [hit file]")
# start the timer
start = time.time()
# open the files with some error checking
try:
    inFile = open(sys.argv[1], "r")
except IOError:
    sys.exit("Cannot open log file!")
try:
    outFile = open(sys.argv[2], "w")
except IOError:
    sys.exit("Cannot open hits file!")
# build list of keywords
# I took my list out since the words are bad ones
kw = [ "keywords", "here" ]
# loop through the list and print the lines to a file
for line in inFile.xreadlines():
    for badword in kw:
        if line.find(badword) > -1:
            found = '%s %s' % (badword, line)
            #print found # Print the result
            outFile.write(found) # Write the result
# close the files
inFile.close()
outFile.close()
# let me know when it's done
#print "Finished processing file..."
print "Python took:%6.2f seconds" % (time.time()-start)
RUBY SCRIPT:
# populate the array
# keywords deleted to protect innocent eyes
kw = [ "keywords", "here" ]
# Ruby provides micro-seconds in Time object
# Time.now.usec
start = Time.now
# open the files
inFile = File.open(ARGV[0], "r")
outFile = File.new(ARGV[1], "w+")
# search for a word and print the entire line
inFile.each_line { |line|
  for word in kw
    if line.include?(word)
      #print line
      outFile.write "#{word} #{line}"
    end
  end
}
# close the files
inFile.close
outFile.close
elapsed = Time.now - start
# visual clue script is done
#puts "Finished processing file..."
puts "Ruby took #{elapsed} seconds."
Each script was run once, from a fresh start.
Ruby: 83.938 seconds
Python: 18.52 seconds
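Since each figure above comes from a single run, disk caching or other activity on the machine could skew either number. Here is a minimal sketch, in present-day Python 3 rather than the 2.2 used above, of timing a job several times and keeping the fastest run. The names best_of and sample_workload are hypothetical, and the workload is only a stand-in for the real log scan.

```python
import time

def best_of(func, runs=3):
    """Call func() `runs` times and return the fastest wall-clock time in seconds."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        func()
        timings.append(time.perf_counter() - start)
    return min(timings)

def sample_workload():
    # Hypothetical stand-in for "scan the log file for keywords".
    return sum(1 for i in range(100000) if i % 7 == 0)

print("best of 3: %.4f seconds" % best_of(sample_workload))
```

Taking the minimum rather than the average is one common choice, on the assumption that the fastest run is the one least disturbed by everything else running on the machine.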
I do NOT know whether Python's xreadlines is doing the same thing as
Ruby's each_line in my script. If my each_line is not equivalent, please
let me know so I can run it again.
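For reference, here is a minimal sketch of the same per-line scan in present-day Python 3 (an assumption, since the script above targets 2.2), where iterating over the file object directly takes the place of xreadlines(). As with Ruby's each_line, each pass of the loop gets one line with its trailing newline still attached. The function name find_hits and the sample data are made up for illustration, and io.StringIO stands in for the real log file.

```python
import io

def find_hits(lines, keywords):
    """Return one 'keyword line' string for every keyword found in each line."""
    hits = []
    for line in lines:              # one line per pass, trailing newline kept
        for word in keywords:
            if word in line:        # plain substring test, like line.find(word) > -1
                hits.append("%s %s" % (word, line))
    return hits

# Hypothetical sample data standing in for the real log file.
sample_log = io.StringIO("alpha request\nbeta request\nalpha and beta\n")
hits = find_hits(sample_log, ["alpha", "beta"])
print("".join(hits), end="")
```

A line that contains several keywords is written once per keyword, which matches what both scripts above do.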
Bob