i am noticing ruby is much slower than perl when matching patterns with
two or more quantifiers (+ or *). consider this string:
$ ruby -e 'print "A"*3,"\n"*10000,"B"*10,"\n"*10000' > str
and matching this string:
$ perl -0 -ne'print 1 if /A.+\n+B/s' str # 0m1.682s
$ ruby -0 -ne'print 1 if /A.+\n+B/m' str # 0m12.468s
$ perl -0 -ne'print 1 if /A.+\n.+\nB/s' str # 0m1.368s
$ ruby -0 -ne'print 1 if /A.+\n.+\nB/m' str # 0m11.427s
php (pcre) speed is somewhere in between.
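for anyone who wants to repeat the ruby side without the shell one-liners, here is a minimal sketch that builds the same string in memory and times the two patterns with the standard benchmark library. the helper names (build_subject, time_patterns) are made up for illustration, and the timings will depend heavily on the ruby version and machine, so no expected numbers are given.

```ruby
require 'benchmark'

# Build the test string from the post: 3 "A"s, a run of newlines,
# 10 "B"s, and another run of newlines. (Helper name is hypothetical.)
def build_subject(newlines = 10_000)
  "A" * 3 + "\n" * newlines + "B" * 10 + "\n" * newlines
end

# The two patterns from the timings above. /m in Ruby lets "." match
# a newline, which is what /s does in Perl.
PATTERNS = {
  'A.+\n+B'    => /A.+\n+B/m,
  'A.+\n.+\nB' => /A.+\n.+\nB/m,
}

# Run each pattern once against the subject and record whether it
# matched and how long the match took, in seconds of wall-clock time.
def time_patterns(subject, patterns = PATTERNS)
  patterns.map do |label, re|
    matched = nil
    seconds = Benchmark.realtime { matched = !(subject =~ re).nil? }
    [label, matched, seconds]
  end
end

if __FILE__ == $0
  time_patterns(build_subject).each do |label, matched, seconds|
    printf("%-12s matched=%-5s %.3fs\n", label, matched, seconds)
  end
end
```

this only measures the match itself, not interpreter startup or reading the file, so the numbers are not directly comparable to the `time` figures above.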
of course, with patterns like /A.+\n+.+B/ or those with more quantifiers
both become very slow, but i suspect perl is still faster than ruby
especially when there are fewer “.+” and more “SINGLECHAR+” subpatterns.
i even have one case where perl takes less than a second and ruby
takes more than five minutes (i don’t know for sure, i interrupted the
process so ruby never finished).
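one way to see how quickly things degrade as quantifiers are added is to time the same pattern against strings of increasing size. the sketch below does that for a two-quantifier and a three-quantifier pattern. the helper names and the sizes are assumptions picked for illustration (kept small so the run finishes quickly), and the shape of the growth will vary with the regex engine in use.

```ruby
require 'benchmark'

# Same shape as the test string in this post, with a configurable
# run length. (Helper name is hypothetical.)
def subject_with(newlines)
  "A" * 3 + "\n" * newlines + "B" * 10 + "\n" * newlines
end

# Time one match of `re` against subjects of increasing size.
# Returns an array of [newlines, seconds] pairs.
def growth(re, sizes)
  sizes.map do |n|
    s = subject_with(n)
    [n, Benchmark.realtime { s =~ re }]
  end
end

if __FILE__ == $0
  sizes = [100, 200, 400] # assumed sizes, deliberately small
  {
    'two quantifiers'   => /A.+\n+B/m,
    'three quantifiers' => /A.+\n+.+B/m,
  }.each do |label, re|
    growth(re, sizes).each do |n, secs|
      printf("%-18s n=%-4d %.4fs\n", label, n, secs)
    end
  end
end
```

if the time roughly doubles when n doubles, the engine is handling the pattern in about linear time. if it grows by a factor of four or eight, backtracking is dominating.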
note that i’m not implying anything though, because i understand perl’s
regex engine has undergone a long period of tweaking and optimization.
--
dave
ps: using ruby 1.6.7 vs perl 5.8.0 on linux