Thanks, all, for the speedup tips.
I have tried all of them, and the fastest version is attached.
Results:
ruby : 115 sec → 62 sec (wow)
python : 60 sec → 53 sec
The Ruby speedup is really impressive. Half of the improvement comes from using a
regex to eliminate comment lines, and half from a different way of extracting the second field.
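For readers without the attachment, the two changes might look roughly like the sketch below. It is not the attached parse.rb: the `#` comment marker, the tab separator, and the helper name `second_field` are all assumptions made for illustration.

```ruby
# Hypothetical sketch, NOT the attached parse.rb.
# Assumptions: comment lines start with '#', fields are tab-separated.

COMMENT = /\A#/

# Return the second tab-separated field without splitting the whole line.
# Returns nil for comment lines and for lines with fewer than two fields.
def second_field(line)
  return nil if line =~ COMMENT
  first_tab = line.index("\t")
  return nil unless first_tab
  # If there is no second tab, the field runs to the end of the line.
  second_tab = line.index("\t", first_tab + 1) || line.chomp.length
  line[(first_tab + 1)...second_tab]
end

# Possible usage (file name taken from the attachment list):
#   File.foreach("web.log") do |line|
#     field = second_field(line)
#     next unless field
#     # test the field against the condition here
#   end
```

The idea is that `String#index` only scans up to the second tab, so the rest of the line is never copied into an array of fields.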
I have also attached a snippet from my web.log file.
The information in it is faked, but the structure is the same.
This solution seems to be the fastest (or close to the fastest) way to get the lines whose 2nd field
satisfies the condition.
But in general there could be conditions on more fields. What then?
Use String.split to get the fields and then match individual fields, or build an all-in-one regex and
try to match the whole line?
<snippet_1>
pom = line.split("\t")
if pom[1] =~ /expr_to_match_1/ and pom[3] =~ /expr_to_match_2/ and …
  do_something
end
</snippet_1>
OR?
<snippet_2>
if line =~ /expr_to_match_1 … expr_to_match_2 … expr_to_match_3/
  do_something
end
</snippet_2>
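To make the comparison concrete, here is a minimal sketch of both variants that can be timed against each other. The patterns (`GET` in the 2nd field, `200` as the 4th field), the tab separator, and the helper names are placeholders chosen for illustration, not taken from the attached files. Which variant is faster likely depends on the data and the Ruby version, so it is worth measuring on the real log.

```ruby
# Sketch only: patterns and field positions are hypothetical placeholders.
FIELD_2 = /GET/       # assumed condition on the 2nd field
FIELD_4 = /\A200\z/   # assumed condition on the 4th field

# Variant 1: split into fields, then test each field of interest.
def match_by_split?(line)
  f = line.chomp.split("\t")
  !!(FIELD_2 =~ f[1] && FIELD_4 =~ f[3])
end

# Variant 2: one regex over the whole line. [^\t]* never crosses a tab,
# so each condition stays tied to its own column.
LINE_RE = /\A[^\t]*\t[^\t]*GET[^\t]*\t[^\t]*\t200(?:[\t\r\n]|\z)/

def match_by_regex?(line)
  !!(line =~ LINE_RE)
end

# Count matching lines and return [hits, elapsed_seconds].
def time_it(lines)
  t0 = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  hits = lines.count { |l| yield l }
  [hits, Process.clock_gettime(Process::CLOCK_MONOTONIC) - t0]
end

# Possible usage:
#   lines = File.readlines("web.log")
#   p time_it(lines) { |l| match_by_split?(l) }
#   p time_it(lines) { |l| match_by_regex?(l) }
```

Note that a loose pattern such as `/a.*b.*c/` can match across columns; anchoring with `\A` and using `[^\t]*` between tabs avoids that and also limits backtracking.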
Thanks
Tom
parse.rb (392 Bytes)
web.log (1.71 KB)
parse.py (572 Bytes)
--- Joseph McDonald <joe@vpop.net> wrote:
can you give a few examples of the lines in the logfile?
thanks,
-joe

Friday, September 20, 2002, 9:22:34 AM, you wrote:
Hello,
I have written a simple script to analyze a log file and (just for fun) I have written
exactly the same thing in Python to see the difference, and …
Python is almost twice as fast doing the same job (???)
See the attached files for the sources.
Environment: P4 1.8 GHz, 256 MB, WinXP Pro. Python 2.2.1, Ruby 1.7.2-4 (the Pragmatic
distribution).
The analyzed file is about 420 MB; Python processes it in about 60 sec and Ruby in about 115 sec.
Any suggestions on how to speed up the Ruby code?
Regards
Tom
Do you Yahoo!?
New DSL Internet Access from SBC & Yahoo!
http://sbc.yahoo.com
--
Best regards,
Joseph mailto:joe@vpop.net