Ruby extremely slow compared to PHP

Hello there, how are you? Hope you are fine. I am a PHP programmer
trying to switch to Ruby (not for web development) because it seems easy,
and what takes about 20 lines of code in PHP takes far fewer lines in
Ruby. However, the only thing I don't like about Ruby is its syntax.

A few weeks ago I wrote a PHP script that takes an LFI-vulnerable URL
(http://en.wikipedia.org/wiki/Local_File_Inclusion) as input and finds
the web server logs in less than 10 seconds (that's been the average
execution time).

This week I started learning Ruby, and for practice I wanted to
"translate" my PHP script to Ruby. The script ran successfully in Ruby,
but with one problem: the script that took 10 seconds in PHP took around
150 seconds in Ruby. Here is the script: http://pastebin.com/US0wvLtR

I would like some advice from you on how to optimise my script with
regard to execution time, and on anything else that looks odd to you.

I have been testing the script on this URL:
http://www.eluth.com/extras/update.php?read_me=0&readme_file=
Please take into account that the URL (ARGV[0]) needs to be encoded.

Regards!

P.S.: please don't take this post as a comparison between PHP and Ruby
or a "war" between these two languages.

--
Posted via http://www.ruby-forum.com/.

Before you talk about speed, say which Ruby version you use.
1.8 is very slow compared to 1.9 or 2.0, AND 1.8 will die this month.

You sure you want this:

   ret = "../"
   url = url + ret

Doesn't make sense to me.

--
<https://github.com/stomar/>

It would be really helpful to me if you'd also post the PHP version.

For those asking for the PHP version, here it is:
http://pastebin.com/3GX0CicS

There are three different scripts there: lfi.php, funcs.php and dirs.php

Thank you!

So, basically, curl is getting a lot less data than Net::HTTP is
getting. I haven't delved into why this is so, but these are interesting
stats:

This is for the ruby version:

Total fetch time: 28 seconds.
Total fetch size: 1954452 bytes.

real 0m28.175s
user 0m0.224s
sys 0m0.020s

And this is for the PHP version:

Total fetch time: 18 seconds.
Total fetch bytes: 224888

real 0m18.077s
user 0m0.048s
sys 0m0.028s

Look at the difference in the amount of data transferred. Net::HTTP is
pulling in *eight times* as much data as curl is. Why is this the case?
That might be worth more investigation...

But also look at the user and sys times; the algorithms themselves take
nearly no CPU time (well under a second in both runs), which is to be
expected.

At any rate (haha, p.i.), you definitely cannot make the case that Ruby
is slower than PHP from this, as all the time is spent in transfer, not
execution.

P.S. "Mick": I had to clean up your PHP script a bit as it was tossing
warnings and notices all over. Your ruby script, OTOH, was flawless. I
did change a couple things in order to get the totals, but not the
fundamental algorithms.
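
For readers who want to collect similar totals, here is a minimal sketch of one way to do it. It is not the modified script from this post: `measure_fetches` is a hypothetical helper, and the block passed to it is a stand-in fetcher that returns fake bodies so the example runs offline.

```ruby
# Hypothetical helper (not from the original script): times a batch of
# fetches and sums the bytes received. The block does the actual fetch.
def measure_fetches(urls)
  started = Time.now
  total_bytes = 0
  urls.each do |url|
    body = yield(url)
    total_bytes += body.bytesize
  end
  { seconds: Time.now - started, bytes: total_bytes }
end

# Stand-in pages (assumption): fake bodies instead of real HTTP responses.
fake_pages = {
  'http://example.test/a' => 'x' * 1200,
  'http://example.test/b' => 'y' * 800
}

stats = measure_fetches(fake_pages.keys) { |url| fake_pages.fetch(url) }

puts "Total fetch time: #{stats[:seconds].round} seconds."
puts "Total fetch size: #{stats[:bytes]} bytes."
```

To measure real requests, the block could call `Net::HTTP.get(URI(url))` instead of reading from the hash.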

Hans Mackowiak wrote in post #1111042:

before you are talking about speed, talk about what ruby version you use
1.8 is very slow compared to 1.9 or 2.0 AND 1.8 will die this month

Hello, I ran ruby -v and got "ruby 1.9.3p194 (2012-04-20 revision 35410)
[i686-linux]"

unknown wrote in post #1111044:

You sure you want this:

   ret = "../"
   url = url + ret

Doesn't make sense to me.

So one problem could be "redundant" variables? Thank you!

My guess (totally untested) would be that curl/PHP is getting a gzipped version of the page whilst Net::HTTP is not.

--
Alex Gutteridge
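
As a rough illustration of why compression alone could account for a large gap in transferred bytes, the sketch below deflates a made-up, log-like string with Ruby's Zlib. The log lines are invented for the example and say nothing about what the server in this thread actually sent; checking the response headers of the real requests would be the way to confirm or rule out the guess.

```ruby
require 'zlib'

# Hypothetical stand-in for a web server log: many near-identical lines.
log = (1..1000).map do |i|
  "127.0.0.1 - - [02/Jun/2013:21:02:00 +0000] \"GET /index.php?id=#{i} HTTP/1.1\" 200 1024\n"
end.join

# Deflate is the compression algorithm used inside gzip.
compressed = Zlib::Deflate.deflate(log)
ratio = log.bytesize.to_f / compressed.bytesize

puts "Raw size:        #{log.bytesize} bytes"
puts "Compressed size: #{compressed.bytesize} bytes"
puts "Ratio:           #{ratio.round(1)}x"
```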

Thanks for your answer!

I think I missed the PHP warnings because I have them disabled, but it
would be good practice to turn them on again, haha.

As you may have seen, in my PHP script the cURL handle is not closed,
which improves the request speed a lot, as I have tested. Maybe that is
why the Ruby script is slower.

So the "slowness" of the script is because of how connections are
handled I guess.

Regards.

I didn't mean that. You probably meant

   url = ret + url

(prepend '../' to the original url) ???

--
<https://github.com/stomar/>

You can do this with Net::HTTP as well, see the docs:

But a more apples-to-apples comparison would use a Ruby library that also
uses libcurl under the hood. I suggest Typhoeus:
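
A minimal sketch of connection reuse with Net::HTTP, assuming nothing about the original script: the tiny TCP server below is a throwaway stand-in written only so the example runs offline and can count connections. The relevant part is the `Net::HTTP.start` block, inside which every `http.get` shares one connection as long as the server allows keep-alive.

```ruby
require 'net/http'
require 'socket'

# Throwaway local HTTP server (illustration only) that counts TCP connections.
server = TCPServer.new('127.0.0.1', 0)
port = server.addr[1]
connections = 0

Thread.new do
  loop do
    client = server.accept
    connections += 1
    Thread.new(client) do |sock|
      # Serve requests on this socket until the client closes it.
      while (request_line = sock.gets)
        # Skip the request headers up to the blank line.
        loop do
          header = sock.gets
          break if header.nil? || header == "\r\n"
        end
        body = "ok: #{request_line.split[1]}"
        response = "HTTP/1.1 200 OK\r\n" \
                   "Content-Type: text/plain\r\n" \
                   "Content-Length: #{body.bytesize}\r\n" \
                   "Connection: keep-alive\r\n\r\n" + body
        sock.write(response)
      end
      sock.close
    end
  end
end

paths = ['/a', '/b', '/c']

# One start block: the same TCP connection serves every request.
# The third argument (nil) disables any proxy taken from the environment.
bodies = Net::HTTP.start('127.0.0.1', port, nil) do |http|
  paths.map { |path| http.get(path).body }
end

puts bodies.inspect
puts "TCP connections opened: #{connections}"
```

Against this stand-in server the three requests should share a single connection, whereas calling `Net::HTTP.get` once per URL opens a new connection each time.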

On Mon, Jun 3, 2013 at 4:05 PM, Mick Jagger <lists@ruby-forum.com> wrote:

As you could have seen, in my PHP script, the cURL handle is not closed,
what improves a lot the request speed as I have tested.

Mick Jagger <lists@ruby-forum.com> wrote:

> As you could have seen, in my PHP script, the cURL handle is not closed,
> what improves a lot the request speed as I have tested. May be because
> of this the Ruby script is slower.

The curl handle *is* closed -- it is a local variable (no globals
given). Plus, you are in fact reinitializing it every time at the top of
the two function calls.

//This functions returns body of $url
  function getBody($url){
    $ch = curl_init();

//This functions returns response size
  function getResponseSize($url){
    $ch = curl_init();

(These are from your pastebin entries.)

I'd be curious to see what you had when you thought you weren't actually
re-initializing, but that's way off topic for this group.

> So the "slowness" of the script is because of how connections are
> handled I guess.

I'm still going with compression vs. non-compressed.

Forget it.

I had only time for a quick browsing of your code,
and that seemed odd to me, but of course you are right.

--
<https://github.com/stomar/>

Avdi Grimm wrote in post #1111195:

Hi Avdi,

Nice to meet you here! I wanted to know which editor you use in your
RubyTapas video tutorials to explain the code. That editor is awesome.
Could you please tell me its name?

Thanks

Well, running the code against the example URL, the reason it took so
long was these files:

16700151  http://www.eluth.com/extras/update.php?read_me=0&readme_file=../../../../usr/local/apache2/logs/access_log
  429251  http://www.eluth.com/extras/update.php?read_me=0&readme_file=../../../../usr/local/apache2/logs/error_log
17092095  http://www.eluth.com/extras/update.php?read_me=0&readme_file=../../../../var/log/httpd/access_log
  586331  http://www.eluth.com/extras/update.php?read_me=0&readme_file=../../../../var/log/httpd/error_log

Large files will necessarily take longer to download; all the rest were a
little over 1K. That puts the bottleneck on the code that fetches the page
(Net::HTTP.get) rather than on the script overall.

However, without the PHP version of the program to compare against, there
is no way to measure how much slower Ruby is than PHP. Is PHP really
that much faster at downloading a web page?
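
To put those four sizes in proportion, a small sketch that totals them and prints each file's share. The numbers are the ones listed above; the hash keys are shortened versions of the requested paths, used here only as labels.

```ruby
# Response sizes as reported above, keyed by the (shortened) log path.
sizes = {
  '/usr/local/apache2/logs/access_log' => 16_700_151,
  '/usr/local/apache2/logs/error_log'  => 429_251,
  '/var/log/httpd/access_log'          => 17_092_095,
  '/var/log/httpd/error_log'           => 586_331
}

total = sizes.values.inject(:+)

# Largest first, with each file's percentage of the combined size.
sizes.sort_by { |_path, size| -size }.each do |path, size|
  share = 100.0 * size / total
  puts format('%-36s %10d  %5.1f%%', path, size, share)
end
puts "Total: #{total}"
```

The two access logs account for nearly all of the combined size, which matches the point that a few large downloads dominate the run time.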

Emacs. But you're probably referring to the insertion of results into the
buffer, which is provided by xmpfilter from the rcodetools gem.