[ANN] RSRuby 0.4

Hi All,

I have released a new version of RSRuby on rubyforge today:

http://rubyforge.org/projects/rsruby/

RSRuby allows the user to embed a full R interpreter into a Ruby script. This allows the script to call any R function and convert the result back into Ruby. From the R website: 'R is a free software environment for statistical computing and graphics.' Running a Student's t-test (or any other R function) is as simple as:

   require 'rsruby'
   r=RSRuby.instance #Create R interpreter
   ttest = r.t_test([1,2,3]) #Convert [1,2,3] to R. Run t.test function and convert result back to Ruby
   puts ttest['p.value'] #Prints out p.value statistic from the ttest object

In this new release, the whole codebase has been moved over to an RPy (http://rpy.sourceforge.net/) derived core, which has resulted in some changes and new features as well as improved stability. Documentation is still patchy, though there are some examples included in the release and an almost complete conversion of the RPy test suite. A better manual/tutorial is on my list of things for the next release; until then the RPy manual may help.

Dr Alex Gutteridge
Post-Doctoral Researcher

Bioinformatics Center
Institute for Chemical Research
Kyoto University
Gokasho, Uji, Kyoto 611-0011
Japan

insanely cool stuff alex!

-a

···

On Mon, 16 Oct 2006, Alex Gutteridge wrote:

Hi All,

I have released a new version of RSRuby on rubyforge today:

http://rubyforge.org/projects/rsruby/

--
my religion is very simple. my religion is kindness. -- the dalai lama

Alex Gutteridge wrote:

> RSRuby allows the user to embed a full R interpreter into a Ruby script.

Please add an entry for RSRuby to RAA so that it can be found easily. I suggest putting the entry into the Math subcategory of the Library category, because that category already contains several pointers to statistics-related software. It is debatable whether "library" correctly describes what RSRuby is, but I consider the term more appropriate than "application". Besides that: thank you very much, sensei :-)

Josef 'Jupp' Schugt

> RSRuby allows the user to embed a full R interpreter into a Ruby
> script. This allows the script to call any R function and convert the
> result back into Ruby. From the R website: 'R is a free software
> environment for statistical computing and graphics.' Running a
> Student's t-test (or any other R function) is as simple as:
>
>    require 'rsruby'
>    r=RSRuby.instance #Create R interpreter
>    ttest = r.t_test([1,2,3]) #Convert [1,2,3] to R. Run t.test function and convert result back to Ruby
>    puts ttest['p.value'] #Prints out p.value statistic from the ttest object

That looks really great.

However, I have a couple of points about the conversion:

- R Logicals (true/false) <=> Ruby true/false
- R Integers <=> Ruby Fixnum/Bignum
- R Numeric <=> Ruby Float
- R String <=> Ruby String

These guys are all vectors in R - do you mean vectors of length one
are converted to the corresponding Ruby primitives? (Also,
confusingly, in R a string is called a character vector of length 1)

- R Vector <=> Ruby Array (homogeneous)

There isn't really a vector "class" in R, independent of the things
listed above.

- R List <=> Ruby Hash

I think a better mapping would be to Ara's ArrayFields class. Lists
in R can be accessed by name or by position.

Regards,

Hadley
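
As a rough illustration of the point above about R lists being accessible by name or by position, here is a minimal pure-Ruby sketch. The class name NamedList and the sample values are made up for this example; it is not the ArrayFields API and not how RSRuby converts lists.

```ruby
# Hypothetical container (not ArrayFields, not RSRuby): values can be read
# either by position or by name, like the elements of an R list.
class NamedList
  def initialize(names, values)
    unless names.length == values.length
      raise ArgumentError, 'names and values differ in length'
    end
    @names = names.map(&:to_s)
    @values = values.dup
  end

  # Integer keys index by position (0-based, Ruby style); anything else is a name.
  def [](key)
    return @values[key] if key.is_a?(Integer)
    index = @names.index(key.to_s)
    index.nil? ? nil : @values[index]
  end

  # Yields name/value pairs in their original order.
  def each_pair
    @names.each_with_index { |name, i| yield name, @values[i] }
  end
end

# Made-up values standing in for part of a t.test result.
result = NamedList.new(['statistic', 'p.value'], [3.46, 0.07])
puts result['p.value'] # lookup by name
puts result[0]         # lookup by position
```

Note that positions here are 0-based as in Ruby, whereas R indexes from 1; a real mapping would have to pick one convention.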

>    require 'rsruby'
>    r=RSRuby.instance #Create R interpreter
>    ttest = r.t_test([1,2,3]) #Convert [1,2,3] to R. Run t.test

Also, calling t_test instead of t.test is a bit worrying, suggesting
that you are automatically converting _ to . - not a good idea!

Hadley

Yeah ... maybe this will save me the trouble of getting the SWIG stuff
working. Actually, though, I have been able to build a SWIG-wrapped R
shared library, after about a week of wrestling with the header files,
but I haven't had a chance to try out any test cases, so I have no idea
what it will do.

In the long run, I think I still want to do the SWIG wrapping, because,
at least for simple libraries, it's scripting-language agnostic. You do
one set of interface files and you can access your library from Ruby,
Python, Perl, PHP (4), a couple of Schemes, Lua, Pike (whatever *that*
is), Java and one variant of Common Lisp (clisp IIRC).

Incidentally, my version of R also builds a shared LAPACK. I think it's
the FORTRAN-callable version, though, not the C-callable version. I
haven't been able to find a full Ruby/LAPACK interface. There's a piece
of it in the Ruby GSL package, and another version in something called
"linalg". I haven't been able to get "linalg" to build, though.

···

ara.t.howard@noaa.gov wrote:

insanely cool stuff alex!

-a

> That looks really great.
>
> However, I have a couple of points about the conversion:
>
> - R Logicals (true/false) <=> Ruby true/false
> - R Integers <=> Ruby Fixnum/Bignum
> - R Numeric <=> Ruby Float
> - R String <=> Ruby String
>
> These guys are all vectors in R - do you mean vectors of length one
> are converted to the corresponding Ruby primitives? (Also,
> confusingly, in R a string is called a character vector of length 1)

Yes you are quite correct. In the basic (default) conversion mode vectors of length one are converted to Ruby primitives. However, in the 'vector' conversion mode an Array of length one is returned instead (closer to R semantics). E.g.

irb(main):001:0> require 'rsruby'
=> true
irb(main):002:0> r = RSRuby.instance
=> #<RSRuby:0xb7d11220>
irb(main):003:0> r.sum(1,2,3).class
=> Fixnum
irb(main):004:0> RSRuby.set_default_mode(RSRuby::VECTOR_CONVERSION)
=> 1
irb(main):005:0> r.sum(1,2,3).class
=> Array

See the RPy manual for a fuller discussion of the conversion modes. RSRuby pretty much uses an identical scheme.

> - R Vector <=> Ruby Array (homogeneous)
>
> There isn't really a vector "class" in R, independent of the things
> listed above.

Yes - this is a simplification and indeed the documentation is slightly misleading here. In basic mode R vectors/lists of length > 1 that don't have a 'names' attribute are converted to Arrays.

> - R List <=> Ruby Hash

Whereas if a 'names' attribute is included then the vector/list is converted to a Hash.
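
The conversion rules described so far (length-one vectors, unnamed vectors, named vectors) can be sketched in plain Ruby without R installed. This is only an illustration: RVector and to_ruby_sketch are hypothetical names, not RSRuby code, and letting names take priority in vector mode is an assumption of the sketch.

```ruby
# Hypothetical stand-in for an R vector/list: values plus an optional
# 'names' attribute (nil when no names are set).
RVector = Struct.new(:values, :names)

# Sketch of the rules discussed in this thread, not RSRuby's implementation.
def to_ruby_sketch(vec, mode = :basic)
  if vec.names
    # A 'names' attribute is present: build a Hash keyed by name.
    # (Assumption: this applies in both modes.)
    Hash[vec.names.zip(vec.values)]
  elsif mode == :basic && vec.values.length == 1
    # Basic mode: a length-one vector becomes a plain Ruby value.
    vec.values.first
  else
    # Vector mode, or length > 1 without names: an Array.
    vec.values.dup
  end
end

p to_ruby_sketch(RVector.new([6], nil))                # basic mode, length one
p to_ruby_sketch(RVector.new([6], nil), :vector)       # vector mode, length one
p to_ruby_sketch(RVector.new([1, 2, 3], nil))          # unnamed, length > 1
p to_ruby_sketch(RVector.new([1, 2], ['a', 'b']))      # named
```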

> I think a better mapping would be to Ara's ArrayFields class. Lists
> in R can be accessed by name or by position.

I tentatively agree. My goal for this release was to implement the RPy conversion routines and test suite as faithfully as possible. In RPy named lists/vectors are converted to Python Dictionaries which are (exactly?) equivalent to Ruby Hashes so I kept to that scheme.

As a Ruby programmer first and an R programmer second, my general philosophy was to try and force R concepts into Ruby (hence converting lists to Hashes) rather than vice-versa. Using something like ArrayFields gets us closer to R but at the cost of moving away from 'canonical' (i.e. standard library) Ruby. Clearly this is a balancing act between getting as close to R semantics as possible without moving too far from 'normal' Ruby. For me that balance point may lie closer to Ruby than for other users.

One final point: The RSRuby conversion routines can be customised by the user using the 'proc' and 'class' conversion modes. These conversion modes are quite powerful and are designed to allow the user to implement custom routines for any R/Ruby interconversion they want. Implementing a list <-> ArrayFields converter in the current system is (moderately) trivial. The code below implements the R -> Ruby side. One would need to write a suitable to_r method for Array to do the conversion the other way - left as an exercise for the reader ;-)

require 'rubygems'
require_gem 'arrayfields'
require 'rsruby'

# Called on each object returned by R: true if the 'names' attribute is set.
test_proc = lambda{|x|
   r = RSRuby.instance
   names = r.attr(x,'names')
   !names.nil?
}

# Used instead of the inbuilt RSRuby conversion when test_proc returns true.
conv_proc = lambda{|x|
   r = RSRuby.instance
   names = r.attr(x,'names') #Retrieve the names
   hash = x.to_ruby #Convert the object (x) to Ruby - results in a Hash
   array = [] #But we want an ArrayFields array
   array.fields = names #Set the field names
   names.each do |field| #Copy the values across from the Hash
     array[field] = hash[field]
   end
   array #Return the Array
}

r = RSRuby.instance #Start R
r.t_test.autoconvert(RSRuby::PROC_CONVERSION) #Set the t.test method to use proc conversion
r.proc_table[test_proc] = conv_proc #conv_proc is run if test_proc returns true

ttest = r.t_test([1,2,3]) #Call t.test function - returns list in R
puts ttest.class #Normally list <=> Hash, but here it's an array!
ttest.each_pair do |field,val| #But not just any array - an array with fields!
   puts "#{field} - #{val}"
end

puts ttest[1..3] #That retains order!

Dr Alex Gutteridge
Post-Doctoral Researcher

Bioinformatics Center
Institute for Chemical Research
Kyoto University
Gokasho, Uji, Kyoto 611-0011
Japan

···

On 16 Oct 2006, at 22:37, hadley wickham wrote:

Hmmm. Yes 't_test' is automatically converted to 't.test'. Again, this is the system used by RPy, which I have been slavishly copying - perhaps it is non-optimal.

The current release doesn't have an option to turn off the automatic function name conversion but it would be very easy to add. Many (the majority?) of the R functions I've seen use '.' as a word separator, but I agree it is naive to think that all functions will do this. The automatic conversion will stay in though because it is much prettier (in my opinion) to write:

r.t_test([1,2,3])

than the alternative:

r['t.test'].call([1,2,3])

If you can think of another scheme for calling this kind of method then I'd be happy to hear it.
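
For readers unfamiliar with how this kind of name translation is usually done in Ruby, here is a minimal sketch using method_missing. It does not use RSRuby or R: DotNameProxy and the lambdas in the table are hypothetical stand-ins, so treat it as an illustration of the idea rather than RSRuby's implementation.

```ruby
# Hypothetical proxy: looks functions up in a Hash keyed by R-style names.
class DotNameProxy
  def initialize(functions)
    @functions = functions
  end

  # Explicit lookup by exact name, e.g. proxy['t.test'].call(...)
  def [](name)
    @functions.fetch(name)
  end

  # Any unknown method has '_' rewritten to '.' before the lookup.
  def method_missing(name, *args)
    r_name = name.to_s.tr('_', '.')
    return @functions[r_name].call(*args) if @functions.key?(r_name)
    super
  end

  def respond_to_missing?(name, include_private = false)
    @functions.key?(name.to_s.tr('_', '.')) || super
  end
end

r = DotNameProxy.new(
  't.test'  => lambda { |xs| "t.test on #{xs.inspect}" },
  'my_func' => lambda { |x| "my_func on #{x}" } # name contains a real '_'
)

puts r.t_test([1, 2, 3])   # found via 't.test'
puts r['my_func'].call(1)  # exact-name lookup still works

begin
  r.my_func(1)             # rewritten to 'my.func', which does not exist
rescue NoMethodError => e
  puts "not found: #{e.name}"
end
```

In this sketch a function whose real name contains an underscore is only reachable through the explicit lookup, which is the trade-off being discussed.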

Dr Alex Gutteridge
Post-Doctoral Researcher

Bioinformatics Center
Institute for Chemical Research
Kyoto University
Gokasho, Uji, Kyoto 611-0011
Japan

···

On 16 Oct 2006, at 22:39, hadley wickham wrote:

>    require 'rsruby'
>    r=RSRuby.instance #Create R interpreter
>    ttest = r.t_test([1,2,3]) #Convert [1,2,3] to R. Run t.test
>
> Also, calling t_test instead of t.test is a bit worrying, suggesting
> that you are automatically converting _ to . - not a good idea!
>
> Hadley

> The current release doesn't have an option to turn off the automatic
> function name conversion but it would be very easy to add. Many (the
> majority?) of the R functions I've seen use '.' as a word separator,

Well, generally, the correct use of the . is for S3 methods, to
separate the function and class name. However, in the past, because _
was used for assignment, many methods were named t.test instead of the
more correct t_test (t.test is not the t method for the test class)

> but I agree it is naive to think that all functions will do this. The
> automatic conversion will stay in though because it is much prettier
> (in my opinion) to write:
>
> r.t_test([1,2,3])
>
> than the alternative:
>
> r['t.test'].call([1,2,3])

I agree that it is nicer, it will just seriously screw up anyone
trying to use methods containing _. No great ideas for a better
syntax though.

Hadley

> Yes you are quite correct. In the basic (default) conversion mode
> vectors of length one are converted to Ruby primitives. However, in
> the 'vector' conversion mode an Array of length one is returned
> instead (closer to R semantics). E.g.
>
> irb(main):001:0> require 'rsruby'
> => true
> irb(main):002:0> r = RSRuby.instance
> => #<RSRuby:0xb7d11220>
> irb(main):003:0> r.sum(1,2,3).class
> => Fixnum
> irb(main):004:0> RSRuby.set_default_mode(RSRuby::VECTOR_CONVERSION)
> => 1
> irb(main):005:0> r.sum(1,2,3).class
> => Array

That seems a fair compromise to me, especially given that the precise R
semantics aren't well defined due to the default operation of [ to
drop dimensions of size 1.

> I think a better mapping would be to Ara's ArrayFields class. Lists
> in R can be accessed by name or by position.

> I tentatively agree. My goal for this release was to implement the
> RPy conversion routines and test suite as faithfully as possible. In
> RPy named lists/vectors are converted to Python Dictionaries which
> are (exactly?) equivalent to Ruby Hashes so I kept to that scheme.

That's a very reasonable goal. RSRuby really does look great and I'll
look forward to being able to use R from within ruby.

Hadley

Alex Gutteridge wrote:

> The automatic conversion will stay in though because it is much prettier (in my opinion) to write:
>
> r.t_test([1,2,3])
>
> than the alternative:
>
> r['t.test'].call([1,2,3])

Well, while harder, it is pretty easy to write an API that responds to:
   r.t.test([1,2,3])
assuming that "t" isn't some sort of R function. (See FlexMock for an example API.)

Now, I know zero about R, so.... bleh.

Devin

> Well, while harder, it is pretty easy to write an API that responds to:
>    r.t.test([1,2,3])
> assuming that "t" isn't some sort of R function. (See FlexMock for an
> example API.)

Unfortunately, in this case, t is a R function (but not in general).

Hadley
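
To make the r.t.test idea and its catch concrete, here is a small sketch of a chained-call proxy. It does not touch R: ChainedCall and the function table are hypothetical, and the rule "no arguments means keep chaining" is just one possible way to work around t being a function in its own right.

```ruby
# Hypothetical proxy that builds dotted names from chained method calls.
# BasicObject is used so that inherited methods such as Kernel#test do not
# get in the way of names like 'test'.
class ChainedCall < BasicObject
  def initialize(functions, prefix = nil)
    @functions = functions
    @prefix = prefix
  end

  def method_missing(name, *args)
    full_name = [@prefix, name.to_s].compact.join('.')
    if args.empty?
      # No arguments: assume the caller is still building a dotted name.
      ::ChainedCall.new(@functions, full_name)
    elsif @functions.key?(full_name)
      @functions[full_name].call(*args)
    else
      super
    end
  end
end

# Stand-ins for two R functions whose names collide on the 't' prefix.
functions = {
  't'      => lambda { |m| m.transpose },
  't.test' => lambda { |xs| "t.test on #{xs.length} values" }
}
r = ::ChainedCall.new(functions)

puts r.t.test([1, 2, 3])           # resolves to 't.test'
p r.t([[1, 2], [3, 4]])            # resolves to 't'
```

The cost of this rule is that a function can never be called with zero arguments, since an argument-less call is always read as one more step in the chain.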