Hmm. Simplifying my test script further, I am not sure that Regexp is the problem at all!
With the each_line block, my script takes more than TWICE as long in 1.9 as in 1.8.
Without the each_line block, but still with the Regexp, it is 10% FASTER in 1.9.
So unless there is some internal optimisation that occurs when the block is removed, it looks like each_line is the problem, not Regexp???
require 'benchmark'
include Benchmark

# Sample firewall log entry: a single line with no trailing newline.
logfile = "23:59:16 drop 10.14.241.252 >eth2c1 rule: 1015; rule_uid: {6AADF426-0D0C-4C20-A027-06A6DC8C6CE2}; src: 172.25.20.79; dst: 10.14.65.137; proto: tcp; product: VPN-1 & FireWall-1; service: lotus; s_port: 57150;"

bm(12) do |test|
  # Regexp match inside an each_line block
  test.report('WITH each_line:') do
    500000.times do
      logfile.each_line do |line|
        line.match(/src: (.*?);/)
      end
    end
  end
  # Same Regexp match, applied directly to the string
  test.report('WITHOUT each_line:') do
    500000.times do
      logfile.match(/src: (.*?);/)
    end
  end
end

$ ruby logreport3.rb
                        user     system      total        real
WITH each_line:     1.710000   0.000000   1.710000 (  1.717034)
WITHOUT each_line:  1.080000   0.000000   1.080000 (  1.077098)
$ ruby19 logreport3.rb
···
                        user     system      total        real
WITH each_line:     3.680000   0.000000   3.680000 (  3.680009)
WITHOUT each_line:  0.890000   0.000000   0.890000 (  0.893182)
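
One way to check the each_line theory further might be to time each_line with a trivial block and no Regexp at all, next to the bare match. This is only a rough sketch: the helper names (each_line_only, match_only), the shortened sample line, and the iteration count are made up for illustration, and I have not compared its results between 1.8 and 1.9.

```ruby
require 'benchmark'

# Shortened, hypothetical stand-in for the log line used above.
LOGLINE = "23:59:16 drop 10.14.241.252 >eth2c1 rule: 1015; src: 172.25.20.79; dst: 10.14.65.137;"
ITERATIONS = 100_000 # arbitrary; raise it for steadier timings

# Walk the string with each_line n times, doing no Regexp work.
# Returns the total number of lines yielded.
def each_line_only(str, n)
  count = 0
  n.times { str.each_line { |line| count += 1 } }
  count
end

# Run the same Regexp n times directly on the string.
# Returns the captured src address from the last match.
def match_only(str, n)
  last = nil
  n.times { last = str.match(/src: (.*?);/) }
  last[1]
end

Benchmark.bm(16) do |test|
  test.report('each_line only:') { each_line_only(LOGLINE, ITERATIONS) }
  test.report('match only:')     { match_only(LOGLINE, ITERATIONS) }
end
```

If the 'each_line only:' row alone shows the large 1.9 slowdown, that would point at each_line (or block dispatch) rather than Regexp.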