It seems like most of the time would be spent loading the environment and
printing the output, making it difficult to compare regexp speeds.
Sure; so why not do it 1000 times:
#!/usr/bin/ruby
1000.times do
  File.open("index.html").each do |c|
    puts $1 if /href="http:\/\/(.*?)\/.*" target="_blank"/ =~ c
  end
end
time ./test.rb >/tmp/t
elap 6.511 user 6.336 syst 0.136 CPU 99.40%
#!/usr/bin/perl
for ($i=0; $i<1000; $i+=1) {
  open HD,"index.html" or die $!;
  while(<HD>) {
    print $1,"\n" if /href="http:\/\/(.*?)\/.*" target="_blank"/;
  }
  close HD;
}
time ./test.pl >/tmp/t
elap 0.864 user 0.844 syst 0.020 CPU 100.04%
"Compiling" a regular expression does not bring any advantage. In fact, it is usually slower than using the regular expression inline as you did in your first example. The Ruby interpreter optimizes this already.
If the speed difference does not bother you, why bother discussing it?
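A minimal sketch of why the inline form costs nothing extra, assuming a current MRI: a regexp literal without interpolation is built once, so evaluating it repeatedly hands back the same object. The `scheme` variable is hypothetical and is only there to force an interpolated regexp for comparison, which MRI is expected to rebuild on each evaluation.

```ruby
# Evaluate the same regexp literal three times and keep every result.
literals = Array.new(3) { /href="http:/ }
# true if each evaluation returned the very same Regexp object.
same_literal = literals.all? { |re| re.equal?(literals.first) }

# Hypothetical interpolated value (not from the thread), used only to
# make the regexp dynamic.
scheme = "http"
dynamic = Array.new(3) { /href="#{scheme}:/ }
same_dynamic = dynamic.all? { |re| re.equal?(dynamic.first) }

puts "literal reused:      #{same_literal}"
puts "interpolated reused: #{same_dynamic}"
```

If the interpolated form really is needed inside a hot loop, assigning it to a variable once before the loop avoids the repeated construction.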
Btw, I'd probably formulate the regexp differently in order to avoid ".*?" which could be slow. Also, if you have a lot of slashes in the regexp the %r form comes in handy because you do not need all the escapes:
File.foreach "index.html" do |line|
  puts $1 if %r{href="http://([^"/]*)/[^"]*"\s+target="_blank"} =~ line
end
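As a quick check that the rewritten pattern extracts the same host as the original one, here is a small sketch run against a single made-up line. The sample URL is an assumption, and the negated character classes are written as `[^"/]*` and `[^"]*`.

```ruby
# Made-up sample line; the real index.html is not part of the thread.
line = '<a href="http://www.example.com/path/page.html" target="_blank">link</a>'

# Original pattern: lazy ".*?" plus escaped slashes.
lazy = /href="http:\/\/(.*?)\/.*" target="_blank"/
# Rewritten pattern: negated character classes, %r{} so slashes need no escapes.
negated = %r{href="http://([^"/]*)/[^"]*"\s+target="_blank"}

lazy_host = lazy.match(line)[1]
negated_host = negated.match(line)[1]

puts lazy_host
puts negated_host
```

The negated classes cannot run past a closing quote, so the engine has less backtracking to do than with `.*?` followed by a greedy `.*`.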
Kind regards
robert
···
On 01/04/2010 10:22 AM, Ruby Newbee wrote:
On Mon, Jan 4, 2010 at 5:07 PM, Ayumu Aizawa <ayumu.aizawa@gmail.com> wrote:
Hi Jenn.
It's interesting.
How's it?
Thanks for the reminder; I see what you mean.
This time I used a compiled regex for both Ruby and Perl, and the speed is
still different:
Thank you. It's been far too long since I've read Coding Horror.
Although it reminds me, I should bug one of my PhD candidate friends
for some Perl code I counseled him to fix. He was parsing a 500MB+
CSV file with getlines, string compares and splits in Perl... I
think he literally banged his head on the table when I introduced him
to CPAN and showed him the CSV libraries...
···
On Mon, Jan 4, 2010 at 10:50 AM, Rilindo Foster <rilindo@gmail.com> wrote:
You could try Ruby 1.9 and see if it helps the speed.
Not very much.
I get best results in Ruby with:
regexp = %r{href="http://([^"/]*)/[^"]*"\s+target="_blank"}
1000.times do
  puts File.read('index.html').scan(regexp)
end
~/ruby/bench time ruby19 regex.rb > /dev/null
real 0m1.428s
user 0m1.359s
sys 0m0.056s
~/ruby/bench time perl5.10.0 regex.pl > /dev/null
real 0m1.189s
user 0m1.095s
sys 0m0.084s
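For readers unfamiliar with `String#scan`: with one capture group it returns an array of one-element arrays, which `puts` then flattens when printing. A small sketch on a made-up two-link snippet (the HTML is an assumption, not the real index.html):

```ruby
regexp = %r{href="http://([^"/]*)/[^"]*"\s+target="_blank"}

# Made-up input standing in for the contents of index.html.
html = <<~HTML
  <a href="http://www.example.com/a.html" target="_blank">A</a>
  <a href="http://www.example.org/b.html" target="_blank">B</a>
HTML

# One inner array per match, one element per capture group.
matches = html.scan(regexp)
p matches

# Flatten explicitly when a plain list of hosts is wanted.
hosts = matches.flatten
puts hosts
```

Reading the whole file once and scanning it in a single call keeps the per-line block overhead out of the loop, which may be part of why this version timed better.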
It's still slower. Perl has regular expression magic beyond my
imagination, though. I heard they take the most "rare" character in the
literal part of the regex (let's say, the colon) and search for it using
machine code, and then work their way backwards to the beginning of the
regexp...
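The idea described here can be imitated by hand in Ruby. This is only a rough sketch of the general technique (a cheap literal substring test before the full regexp), not a description of what Perl's engine actually does; `RARE_LITERAL`, `hosts_with_prefilter` and the sample lines are all made up for illustration.

```ruby
LINK = %r{href="http://([^"/]*)/[^"]*"\s+target="_blank"}
# Assumption: this literal must appear in every matching line, so lines
# without it can be rejected with a plain substring search.
RARE_LITERAL = 'target="_blank"'

# Returns the captured hosts, running the regexp only on candidate lines.
def hosts_with_prefilter(lines)
  lines.each_with_object([]) do |line, hosts|
    next unless line.include?(RARE_LITERAL)  # cheap rejection first
    m = LINK.match(line)
    hosts << m[1] if m
  end
end

sample = [
  '<p>no links here</p>',
  '<a href="http://www.example.com/a.html" target="_blank">A</a>',
  '<a href="http://www.example.org/b.html">same window</a>',
]

p hosts_with_prefilter(sample)
```

Whether this helps in practice depends on how many lines the prefilter rejects, so it is worth timing against the plain regexp before adopting it.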
Say what you want, but Perl rocks when it comes to text processing
speed.
Python is even faster:
import re
regexp = re.compile(r'href="http://([^"/]*)/[^"]*"\s+target="_blank"')
for i in xrange(1000):
    with open("index.html") as f:
        for m in regexp.finditer(f.read()):
            print m.group(1)
time python2.6 regex.py > /dev/null
real 0m0.943s
user 0m0.880s
sys 0m0.053s
It's still slower. Perl has regular expression magic beyond my
imagination, though. I heard they take the most "rare" character in the
literal part of the regex (let's say, the colon) and search for it using
machine code, and then work their way backwards to the beginning of the
regexp...
I think that's only done when study is called, but I could be wrong.
Say what you want, but Perl rocks when it comes to text processing
speed.
Python is even faster:
import re
regexp = re.compile(r'href="http://([^"/]*)/[^"]*"\s+target="_blank"')
for i in xrange(1000):
    with open("index.html") as f:
        for m in regexp.finditer(f.read()):
            print m.group(1)
time python2.6 regex.py > /dev/null
real 0m0.943s
user 0m0.880s
sys 0m0.053s
Yeah. I love Ruby, but I'm getting a bit annoyed by the fact that it's
so much slower than Python...
The question is: does it matter for most practical purposes - and: do you want to sacrifice a clean and simple program and the fun of creating it for a few cycles of CPU time? I wouldn't - especially since 1.9 is so much faster than 1.8 was. My 0.02EUR.
Kind regards
robert
···
On 01/05/2010 12:37 PM, Marnen Laibow-Koser wrote:
Yeah. I love Ruby, but I'm getting a bit annoyed by the fact that it's
so much slower than Python...
The question is: does it matter for most practical purposes - and: do
you want to sacrifice a clean and simple program and the fun of creating
it for a few cycles of CPU time?
No. That's why I haven't learned Python yet, although between the speed
increase and GAE, it's sometimes tempting. But I'd really miss the
beautiful design of Ruby.
But my point was a bit different. Python and Ruby are basically similar
languages, and what annoys me is that there seems not to have been the
will in the Ruby community to steal some speed tricks from Python. (I'd
be working on this if I knew anything practical about language
implementation, but I don't.)
I wouldn't - especially since 1.9 is
so much faster than 1.8 was. My 0.02EUR.
Unfortunately, I don't quite trust 1.9 for use with Rails yet...
Kind regards
robert
Best,
···
On 01/05/2010 12:37 PM, Marnen Laibow-Koser wrote:
Yeah. I love Ruby, but I'm getting a bit annoyed by the fact that it's
so much slower than Python...
The question is: does it matter for most practical purposes - and: do
you want to sacrifice a clean and simple program and the fun of creating
it for a few cycles of CPU time? I wouldn't - especially since 1.9 is
so much faster than 1.8 was. My 0.02EUR.
Why does everybody say that CPUs are fast nowadays and that "it doesn't
matter that language XYZ is slow"?
It does matter: web applications. If your applications can't serve all
the visitors, then you're going to lose your customer or you'll have to
learn some other language with better performance.
--
Posted via http://www.ruby-forum.com/.
But my point was a bit different. Python and Ruby are basically similar
languages, and what annoys me is that there seems not to have been the
will in the Ruby community to steal some speed tricks from Python. (I'd
be working on this if I knew anything practical about language
implementation, but I don't.)
Yeah, no kidding. Somehow speed just hasn't "felt" like the Ruby
community's thing, until 1.9 at least.
I am working on a few projects to make it faster [and I suppose the
MacRuby, Rubinius and JRuby guys are as well].
Unfortunately, I don't quite trust 1.9 for use with Rails yet...