Performance test: 1.8.0p2 versus 1.6.8

Thanks! I did try it and got just a 10% improvement. The 1.6.8 version is still faster.

  1. Ruby 1.6.8 under Linux (fields=line.split(";", -1); n_fields+=fields.length) ==> 57 seconds
  2. Ruby 1.8.0p2 under Linux (fields=line.split(";", -1); n_fields+=fields.length) ==> 83 seconds
  3. Ruby 1.6.8 under Linux (fields=line.split(/;/, -1); n_fields+=fields.length) ==> 59 seconds
  4. Ruby 1.8.0p2 under Linux (fields=line.split(/;/, -1); n_fields+=fields.length) ==> 65 seconds
  5. Python 2.2.2 under Linux (fields=line.split(";"); n_fields+=len(fields)) ==> 54 seconds

So, your suggestion did improve performance but nevertheless 1.8.0p2 is still slower than 1.6.8 and I believe it shouldn’t! Could it be the result of increased garbage collection on the 1.8.0p2 version? The file I’m processing is reasonably large: 400MB with 3.5 million lines! BTW how can I obtain the time spent by Ruby on garbage collection and the number of times it was executed?
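For anyone wanting to reproduce the comparison on a smaller scale, here is a minimal sketch of the loop being timed. The sample data and the `count_fields` helper are made up for illustration (the real input was the 400MB file), so the absolute numbers will not match the ones above.

```ruby
require "benchmark"

# Hypothetical stand-in for the real input file: a small in-memory sample.
SAMPLE_LINES = Array.new(20_000) { "a;b;c;;e;" }

# Count the fields across all lines, splitting on the given separator
# (a String or a Regexp). The -1 limit keeps trailing empty fields.
def count_fields(lines, separator)
  n_fields = 0
  lines.each do |line|
    fields = line.split(separator, -1)
    n_fields += fields.length
  end
  n_fields
end

string_total = nil
regexp_total = nil
string_time = Benchmark.realtime { string_total = count_fields(SAMPLE_LINES, ";") }
regexp_time = Benchmark.realtime { regexp_total = count_fields(SAMPLE_LINES, /;/) }

printf("String separator: %d fields in %.3fs\n", string_total, string_time)
printf("Regexp separator: %d fields in %.3fs\n", regexp_total, regexp_time)
```

Running the same script under each interpreter gives a rough idea of how the two separators compare, though a short in-memory sample puts far less pressure on the GC than a 3.5 million line file would.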

Regards,

José A. S. Alegria

···

-----Original Message-----
From: nobu.nokada@softhome.net [mailto:nobu.nokada@softhome.net]
Sent: Tuesday, April 22, 2003 11:21 PM
To: ruby-talk ML
Subject: Re: Performance test: 1.8.0p2 versus 1.6.8

Hi,

At Wed, 23 Apr 2003 06:08:20 +0900, José A. S. Alegria <jose.alegria@netcabo.pt> wrote:

So, at least for these very very basic operations Ruby compares well
with Python. Any idea why Ruby 1.8.0p2 is slower than version 1.6.8
for such a basic operation like string.split(";", -1)?

It needs an extra Regexp.quote call in 1.8. Try with string.split(/;/, -1).


Nobu Nakada

IIRC, some GC tunings were changed in 1.8, which could give wildly
different results in programs that rely heavily on GC (like yours).

In order to get the time spent collecting, you could build Ruby with
profiling information and profile the interpreter itself.
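As a different route from profiling the interpreter: later Ruby versions (1.9 and up, so neither of the 1.6.8 or 1.8.0 builds compared here) expose `GC.count` and `GC::Profiler`, which report the number of collections and the time spent in them directly. A rough sketch, where the `measure_gc` helper is a hypothetical name of my own:

```ruby
# Assumes Ruby 1.9 or later: GC.count and GC::Profiler do not exist
# in the 1.6.8 and 1.8.0 interpreters discussed in this thread.

# Run the given block and report how many GC runs happened during it
# and how much time the profiler attributes to them (in seconds).
def measure_gc
  GC::Profiler.enable
  GC::Profiler.clear
  runs_before = GC.count
  yield
  runs = GC.count - runs_before
  seconds = GC::Profiler.total_time
  { runs: runs, seconds: seconds }
ensure
  GC::Profiler.disable
end

stats = measure_gc do
  200_000.times { "a;b;c;;e;".split(";", -1) }
end
printf("GC ran %d times, taking about %.3fs\n", stats[:runs], stats[:seconds])
```

Wrapping the real file-processing loop in the block instead of the toy workload would answer both questions (time and number of runs) in one go, at least on an interpreter new enough to have these methods.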

···

On Wed, Apr 23, 2003 at 09:10:28PM +0900, José Santos Alegria wrote:

Thanks! I did try it and got just a 10% improvement. The 1.6.8 version is still faster.

  1. Ruby 1.6.8 under Linux (fields=line.split(";", -1); n_fields+=fields.length) ==> 57 seconds
  2. Ruby 1.8.0p2 under Linux (fields=line.split(";", -1); n_fields+=fields.length) ==> 83 seconds
  3. Ruby 1.6.8 under Linux (fields=line.split(/;/, -1); n_fields+=fields.length) ==> 59 seconds
  4. Ruby 1.8.0p2 under Linux (fields=line.split(/;/, -1); n_fields+=fields.length) ==> 65 seconds
  5. Python 2.2.2 under Linux (fields=line.split(";"); n_fields+=len(fields)) ==> 54 seconds

So, your suggestion did improve performance but nevertheless 1.8.0p2 is still slower than 1.6.8 and I believe it shouldn’t! Could it be the result of increased garbage collection on the 1.8.0p2 version? The file I’m processing is reasonably large: 400MB with 3.5 million lines! BTW how can I obtain the time spent by Ruby on garbage collection and the number of times it was executed?


Running Debian GNU/Linux Sid (unstable)
batsman dot geo at yahoo dot com

How do you power off this machine?
– Linus, when upgrading linux.cs.helsinki.fi, and after using the machine for several months

i’m not sure how smart it would be with a file that large, but you could
compare the times of your program with one beginning with

GC.disable

in order to guesstimate the percentage of time actually taken by the GC.

i really don’t know how accurate that would be, but if the difference was
significant you should be able to see some difference - either that or
GC.disable - doesn’t.
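a rough sketch of that comparison, using a made-up in-memory workload rather than the real file. the helper names (`split_workload`, `time_with_gc_enabled`, `time_with_gc_disabled`) are hypothetical, and with the collector off memory only grows, so running this against the full 400MB input may well exhaust RAM.

```ruby
require "benchmark"

# Hypothetical workload: split the same sample line many times and
# count the fields, mimicking the loop from the original benchmark.
def split_workload(line_count)
  n_fields = 0
  line_count.times { n_fields += "a;b;c;;e;".split(";", -1).length }
  n_fields
end

# Time the block with the collector running normally.
def time_with_gc_enabled(&work)
  GC.start # begin from a freshly collected heap
  Benchmark.realtime(&work)
end

# Time the block with the collector switched off, re-enabling it afterwards.
def time_with_gc_disabled(&work)
  GC.start
  GC.disable
  Benchmark.realtime(&work)
ensure
  GC.enable
end

with_gc = time_with_gc_enabled { split_workload(100_000) }
without_gc = time_with_gc_disabled { split_workload(100_000) }

printf("GC enabled:  %.3fs\n", with_gc)
printf("GC disabled: %.3fs\n", without_gc)
printf("difference:  %.3fs\n", with_gc - without_gc)
```

the difference is only a guesstimate of the GC cost: allocation behaves differently when nothing is ever freed, so treat a small gap as noise.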

-a

···

On Wed, 23 Apr 2003, [iso-8859-1] José Santos Alegria wrote:

Thanks! I did try it and got just a 10% improvement. The 1.6.8 version is still faster.

  1. Ruby 1.6.8 under Linux (fields=line.split(";", -1); n_fields+=fields.length) ==> 57 seconds
  2. Ruby 1.8.0p2 under Linux (fields=line.split(";", -1); n_fields+=fields.length) ==> 83 seconds
  3. Ruby 1.6.8 under Linux (fields=line.split(/;/, -1); n_fields+=fields.length) ==> 59 seconds
  4. Ruby 1.8.0p2 under Linux (fields=line.split(/;/, -1); n_fields+=fields.length) ==> 65 seconds
  5. Python 2.2.2 under Linux (fields=line.split(";"); n_fields+=len(fields)) ==> 54 seconds

So, your suggestion did improve performance but nevertheless 1.8.0p2 is
still slower than 1.6.8 and I believe it shouldn’t! Could it be the result
of increased garbage collection on the 1.8.0p2 version? The file I’m
processing is reasonably large: 400MB with 3.5 million lines! BTW how can I
obtain the time spent by Ruby on garbage collection and the number of times
it was executed?

Ara Howard
NOAA Forecast Systems Laboratory
Information and Technology Services
Data Systems Group
R/FST 325 Broadway
Boulder, CO 80305-3328
Email: ara.t.howard@fsl.noaa.gov
Phone: 303-497-7238
Fax: 303-497-7259