1.9.1 regex 6.5 times slower than 1.8.6 in at least one case

Hello:

In general I've found Ruby 1.9.1 to be 2 times faster than 1.8.6 and roughly the same speed as Python 2.8 which is great... thanks guys, great job!

However, in one case, when using a regular expressions (posted here by someone a long time ago) which I use to determine what numbers in 0..10000 are prime numbers, version 1.9.1 was at least 6.5 times slower (1.8.6 = 67 secs, 1.9.1 = 457 secs).

The regular express is:

     ((("1" * self) =~ /^1$|^(11+?)\1+$/) == nil)

which I've used in different versions of my program running either on its own or as part of an overloading function as follows:

   # Add an "is_prime?" method to the built-in numeric class
   # which returns true if the number is a prime number.
   class Fixnum
     def is_prime?
       ((("1" * self) =~ /^1$|^(11+?)\1+$/) == nil)
     end
   end

I also have a version of the prime-number calculation program that which doesn't use the above regex (i.e. it uses a traditional brute force approach instead) and it runs 2 times faster in 1.9.1. In 1.9.1 the regex gets exponentially worse as the number being evaluated exceeding 1000.

Does anyone know why the regex in 1.9.1 is so much slower than 1.8.6?

Thank You,

Michael

Michael Brooks wrote:

However, in one case, when using a regular expressions (posted here by someone a long time ago) which I use to determine what numbers in 0..10000 are prime numbers, version 1.9.1 was at least 6.5 times slower (1.8.6 = 67 secs, 1.9.1 = 457 secs).

The regular express is:

    ((("1" * self) =~ /^1$|^(11+?)\1+$/) == nil)

As the only guy who would rather use a regex rather than string slicing, that's disheartening news. One thing you might want to check is your encoding. If your default encoding is UTF8, some string operations can be significantly slower:

$ cat p-regex.rb
4000.times do |i|
   ((("1" * i) =~ /^1$|^(11+?)\1+$/) == nil)
end

$ time 1.8/bin/ruby -KN -v p-regex.rb
ruby 1.8.6 (2008-08-11 patchlevel 287) [i686-linux]
real 0m4.411s
user 0m4.292s
sys 0m0.032s

$ time 1.8/bin/ruby -KU -v p-regex.rb
ruby 1.8.6 (2008-08-11 patchlevel 287) [i686-linux]
real 0m4.480s
user 0m4.320s
sys 0m0.004s

$ time 1.9/bin/ruby -KN -v p-regex.rb
ruby 1.9.1p0 (2009-02-22 revision 22551) [i686-linux]
real 0m8.041s
user 0m7.980s
sys 0m0.020s

$ time 1.9/bin/ruby -KU -v p-regex.rb
ruby 1.9.1p0 (2009-02-22 revision 22551) [i686-linux]
real 0m21.709s
user 0m20.913s
sys 0m0.032s

With ascii encoding, ruby1.9 is still slower than 1.8 but at least not six times slower.

···

--
Daniel

Michael Brooks wrote:

However, in one case, when using a regular expressions (posted here by someone a long time ago) which I use to determine what numbers in 0..10000 are prime numbers, version 1.9.1 was at least 6.5 times slower (1.8.6 = 67 secs, 1.9.1 = 457 secs).

I found similar results lately, but the difference was more like a
factor of 12!

RUBY 1.8.6:

$ cd FUN3D
$ find . -name \*_c.\?90 -exec rm {} \;
$ time ruby Ruby/complex_transformations_spike.rb
[...]
real 0m25.221s
user 0m22.292s
sys 0m0.759s

$ ruby -v
ruby 1.8.6 (2008-03-03 patchlevel 114) [universal-darwin9.0]
(as comes with OS X)

RUBY 1.9.1:

$ cd /usr/local/src
$ wget ftp://ftp.ruby-lang.org/pub/ruby/1.9/ruby-1.9.1-p0.tar.gz
$ tar zxf ruby-1.9.1-p0.tar.gz
$ cd ruby-1.9.1-p0
$ ./configure --program-suffix=19 --enable-shared
$ make
$ sudo make install
$ cd -

$ time ruby19 Ruby/complex_transformations_spike.rb
[...]
real 5m18.188s
user 4m54.253s
sys 0m2.866s

$ ruby19 -v
ruby 1.9.1p0 (2009-01-30 revision 21907) [i386-darwin9.6.0]

Regards,

···

--
Bil Kleb
http://fun3d.larc.nasa.gov

// I’m slowly trawling through my ruby-talk backlog, so apologies
// if the below is irrelevant to anyone at this point in time. :slight_smile:

Michael Brooks:

However, in one case, when using a regular expressions (posted here by
someone a long time ago) which I use to determine what numbers in
0..10000 are prime numbers, version 1.9.1 was at least 6.5 times
slower (1.8.6 = 67 secs, 1.9.1 = 457 secs).

The regular express is:

    ((("1" * self) =~ /^1$|^(11+?)\1+$/) == nil)

which I've used in different versions of my program running either
on its own or as part of an overloading function as follows:

  # Add an "is_prime?" method to the built-in numeric class
  # which returns true if the number is a prime number.
  class Fixnum
    def is_prime?
      ((("1" * self) =~ /^1$|^(11+?)\1+$/) == nil)
    end
  end

Note that this is a really ineffective way of testing primarity:

ruby 1.9.1p0 (2009-01-30 revision 21907) [i686-linux]

class Fixnum
  def is_prime?
    ((("1" * self) =~ /^1$|^(11+?)\1+$/) == nil)
  end
end

=> nil

require 'prime' # adds Fixnum#prime? (among others)

=> true

start = Time.now; (0..1000).select(&:is_prime?); Time.now - start

=> 0.405157579

start = Time.now; (0..1000).select(&:prime?); Time.now - start

=> 0.016556705

start = Time.now; (0..5000).select(&:is_prime?); Time.now - start

=> 30.165383712

start = Time.now; (0..5000).select(&:prime?); Time.now - start

=> 0.105887884

Note that the Prime class has two bugs (as of 1.9.1-p0, both will be
fixed in the next 1.9.1 release), but your regex also has one of them:

0.is_prime?

=> true # BUG

1.is_prime?

=> false

0.prime?

=> true # BUG

1.prime?

=> true # BUG

(Neither 0 nor 1 are prime numbers.)

I also have a version of the prime-number calculation program that
which doesn't use the above regex (i.e. it uses a traditional brute
force approach instead) and it runs 2 times faster in 1.9.1.

I highly recommend http://ruby-doc.org/core-1.9/ → Prime class (et al.).

— Shot

···

--

The Ruby community should proceed with all deliberate
speed towards ISO standardization of the language.

Yeah, look what it did to Forth.

Don’t just say it, show it. http://vividpicture.com/aleks/atari/forth.jpg
        [M. Edward (Ed) Borasky, Matt Lawrence, Gregory Brown, ruby-talk]

Daniel DeLorme wrote:

Michael Brooks wrote:

However, in one case, when using a regular expressions (posted here by someone a long time ago) which I use to determine what numbers in 0..10000 are prime numbers, version 1.9.1 was at least 6.5 times slower (1.8.6 = 67 secs, 1.9.1 = 457 secs).

The regular express is:

    ((("1" * self) =~ /^1$|^(11+?)\1+$/) == nil)

As the only guy who would rather use a regex rather than string slicing, that's disheartening news. One thing you might want to check is your encoding. If your default encoding is UTF8, some string operations can be significantly slower:

$ cat p-regex.rb
4000.times do |i|
   ((("1" * i) =~ /^1$|^(11+?)\1+$/) == nil)
end

$ time 1.8/bin/ruby -KN -v p-regex.rb
ruby 1.8.6 (2008-08-11 patchlevel 287) [i686-linux] > real 0m4.411s

$ time 1.8/bin/ruby -KU -v p-regex.rb
ruby 1.8.6 (2008-08-11 patchlevel 287) [i686-linux] > real 0m4.480s

$ time 1.9/bin/ruby -KN -v p-regex.rb
ruby 1.9.1p0 (2009-02-22 revision 22551) [i686-linux] > real 0m8.041s

$ time 1.9/bin/ruby -KU -v p-regex.rb
ruby 1.9.1p0 (2009-02-22 revision 22551) [i686-linux] > real 0m21.709s

With ascii encoding, ruby1.9 is still slower than 1.8 but at least not six times slower.

Daniel

Hello Daniel:

Interesting, I didn't know about those switches.

I ran my 10000 iteration prime number calc program using ruby 1.9.1 with no switch then with the two switches you identified:

ruby "fprim v5.rb" > 477.47 seconds
ruby -KN "fprim v5.rb" > 500.89 seconds
ruby -KU "fprim v5.rb" > 1059.04 seconds

Each test was run twice and the lower number from each reported above. I'm not sure why it's slower than the numbers I first reported.

I'm running Windows XP (which I know is slower than Linux and the default Windows compiled binaries for 1.9.1 might be having some impact too) on an unplugged AMD 1.9 GHz based laptop (it runs at about 1.1 GHz when unplugged) with 2 GB ram. I'm not sure what my default ruby encoding is but it appears to be a bit faster than the -KN switch.

To put those numbers in perspective, if I run your code on my unplugged laptop with the -KN switch it takes 37.07 seconds and without any switch 36.85 seconds. If I change your code to use 10000 iterations and run it with the -KN switch it takes 477.36 seconds and without any switch 500.44 seconds (which is strange because it's a flip of numbers in the fprim test).

All I can say regarding regex performance and 1.9.1 is OUCH.

Michael