Ruby Regexp implementation?

This probably doesn't belong in Ruby Core. I was reading these articles:
http://swtch.com/~rsc/regexp/regexp1.html
http://swtch.com/~rsc/regexp/regexp2.html
http://swtch.com/~rsc/regexp/regexp3.html

And in the first few lines of the first page, it shows that the Regexp
implementation for Perl is absolutely terrible, and it says that Ruby is
"the same" or at least similar. Is Ruby 1.8 / 1.9 using the Thompson NFA
matching? Will Ruby 2.0 use it?
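
For readers unfamiliar with the term: the idea behind Thompson-style matching in the first linked article is to track the set of all states the pattern could be in and advance that set once per input character, instead of trying one alternative at a time and backtracking. Below is a minimal sketch of that idea, restricted to patterns made only of literal characters each optionally followed by "?". The names parse_items, add_state and lockstep_match? are hypothetical helpers made up for illustration; this is not how Ruby's Regexp works internally and not a general regex engine.

```ruby
require "set"

# Hypothetical helper: split a pattern such as "a?a?aa" into
# [character, optional?] pairs. Only literals and "?" are understood.
def parse_items(pattern)
  pattern.scan(/(.)(\??)/m).map { |ch, q| [ch, q == "?"] }
end

# Add state i ("about to match item i") and every state reachable from it
# by skipping optional items. Set#add? returns nil when i is already
# present, in which case its successors were added earlier.
def add_state(states, items, i)
  while states.add?(i)
    break if i >= items.length || !items[i][1]
    i += 1
  end
end

# Anchored match of the whole text against the whole pattern.
# Every live state is advanced once per input character, so no input
# position is ever revisited.
def lockstep_match?(pattern, text)
  items = parse_items(pattern)
  current = Set.new
  add_state(current, items, 0)
  text.each_char do |c|
    nxt = Set.new
    current.each do |i|
      next if i >= items.length            # accepting state has no exits
      add_state(nxt, items, i + 1) if items[i][0] == c
    end
    current = nxt
    return false if current.empty?
  end
  current.include?(items.length)
end

n = 30
puts lockstep_match?("a?" * n + "a" * n, "a" * n)
```

The set can never hold more states than the pattern has items plus one, which is why the work per character stays bounded no matter how many "?" the pattern contains.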

--
Posted via http://www.ruby-forum.com/.

I remember reading that article when it was first published.

Ruby switched to a new regex library, Oniguruma (the engine in Ruby 1.9), almost 2 years ago. Check out http://rubyforge.org/projects/oniguruma for more information.

It would be interesting to see those benchmarks run again with these new implementations.

cr

On Mar 12, 2010, at 12:54 PM, Aldric Giacomoni wrote:

This probably doesn't belong in Ruby Core. I was reading these articles:
Regular Expression Matching Can Be Simple And Fast
Regular Expression Matching: the Virtual Machine Approach
Regular Expression Matching in the Wild

And in the first few lines of the first page, it shows that the Regexp
implementation for Perl is absolutely terrible, and it says that Ruby is
"the same" or at least similar. Is Ruby 1.8 / 1.9 using the Thompson NFA
matching? Will Ruby 2.0 use it?

Aldric Giacomoni wrote:

This probably doesn't belong in Ruby Core. I was reading these articles:
Regular Expression Matching Can Be Simple And Fast
Regular Expression Matching: the Virtual Machine Approach
Regular Expression Matching in the Wild

And in the first few lines of the first page, it shows that the Regexp
implementation for Perl is absolutely terrible, and it says that Ruby is
"the same" or at least similar. Is Ruby 1.8 / 1.9 using the Thompson NFA
matching? Will Ruby 2.0 use it?

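# Build the test pair: full_str is str repeated n times followed by n
# copies of its first character; part_str is just those n characters.
def pathologicalize(str, n)
  tmp = str[0, 1] * n   # str[0, 1] is a one-character String on 1.8 and 1.9
  [str * n + tmp, tmp]
end

puts RUBY_VERSION
1.step(101, 10) do |n|
  full_str, part_str = pathologicalize("a?", n)
  start = Time.now
  1000.times do
    # The interpolated literal is rebuilt on every pass (no /o flag), and
    # the subject is full_str, so this searches "a?a?...aa" for n "a"s.
    full_str =~ /#{part_str}/
  end
  duration = Time.now - start
  puts "#{n}:\t#{duration}"
end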

#output
1.9.1
1: 0.015625
11: 0.015625
21: 0.015625
31: 0.03125
41: 0.03125
51: 0.046875
61: 0.046875
71: 0.046875
81: 0.0625
91: 0.0625
101: 0.078125

Siep
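
For comparison, the first linked article builds its pathological case the other way around from the script above: the pattern is "a?" repeated n times followed by n literal "a"s, and the subject is just n "a"s. A minimal sketch of that orientation follows. The helper name pathological_pair and the chosen values of n are assumptions made for illustration, and n is kept small on purpose because a backtracking matcher may slow down sharply as it grows. Timings depend on the Ruby version and the machine, so none are shown.

```ruby
require "benchmark"

# Hypothetical helper: the pattern/subject pair described in the article.
# Pattern: "a?" * n followed by "a" * n. Subject: "a" * n.
def pathological_pair(n)
  pattern = Regexp.new("a?" * n + "a" * n)
  subject = "a" * n
  [pattern, subject]
end

puts RUBY_VERSION
[5, 10, 15, 20].each do |n|
  # The regex is compiled once, outside the timed block, so only the
  # matching itself is measured.
  pattern, subject = pathological_pair(n)
  seconds = Benchmark.realtime { subject =~ pattern }
  puts "#{n}:\t#{seconds}"
end
```

If the time grows roughly linearly with n the engine is coping with this input; if it roughly doubles with each extra step of n, it is backtracking through the alternatives.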

Hi,

In message "Re: Ruby Regexp implementation?" on Sat, 13 Mar 2010 03:54:09 +0900, Aldric Giacomoni <aldric@trevoke.net> writes:

And in the first few lines of the first page, it shows that the Regexp
implementation for Perl is absolutely terrible, and it says that Ruby is
"the same" or at least similar. Is Ruby 1.8 / 1.9 using the Thompson NFA
matching? Will Ruby 2.0 use it?

No plans. We need more (human) resources.

              matz.