Things That Newcomers to Ruby Should Know (10/16/02)

Hi,

I just updated the list, which is also available in HTML format at
http://www.glue.umd.edu/~billtj/ruby.html.

Regards,

Bill

···

=============================================================================
Things That Newcomers to Ruby Should Know

[1]Plain Text Format
* Resources:
+ HOME PAGE: [2]http://www.ruby-lang.org/en/
+ FAQ: [3]http://www.rubycentral.com/faq/
+ PITFALL:
[4]http://rwiki.jin.gr.jp/cgi-bin/rw-cgi.rb?cmd=view;name=pitfall
+ ONLINE TUTORIAL/DOC/BOOK: [5]http://www.rubycentral.com/book/
+ VERY USEFUL HINTS:
o "Programming Ruby" book by David Thomas and Andrew Hunt,
"When Trouble Strikes" Chapter, "But It Doesn't Work"
Section
o "The Ruby Way" book by Hal Fulton, Chapter 1: "Ruby In
Review"

1. Use "ruby -w" instead of simply "ruby" to get helpful warnings. If
   not invoking "ruby" directly, you can set the environment variable
   RUBYOPT to 'w':
      + win32:
        C:\> set RUBYOPT=w
            or
        pressing F5 (to execute) in the Scite editor will give you warnings
        (and F4 will position at problematic line).
      + unix:
        sh# export RUBYOPT="w"
            or
        csh# setenv RUBYOPT "w"
2. The notation "Klass#method" in documentation is used only to
   represent an "instance method" of an object of class Klass; it is
   not a Ruby syntax at all. A "class method" in documentation, on
   the other hand, is usually represented as "Klass.method" (which is
   a valid Ruby syntax).
3. Be aware of the lexical scoping interaction between local
   variables and block local variables. If a local variable is
   already defined before the block, then the block will use (and
   quite possibly modify) the local variable; in this case the block
   does not introduce a new scope. Example:
        (0..2).each do |i|
          puts "inside block: i = #{i}"
        end
        puts "outside block: i = #{i}"    # >> undefined `i'
   On the other hand,
        i = 0
        (0..2).each do |i|
          puts "inside block: i = #{i}"
        end
        puts "outside block: i = #{i}"    # >> 'outside block: i = 2'
   and
        j = 0
        (0..2).each do |i|
          j = i
        end
        puts "outside block: j = #{j}"    # >> 'outside block: j = 2'
4. The String#[Fixnum] method does not return the "character" (which
   is a string of length one) at the Fixnum position, but instead the
   ASCII character code at the position (however, this may change in
   the future). Currently, to get the character itself, use
   String#[Fixnum,1] instead.
   Furthermore, there are additional ASCII conversion methods such as
      + Integer#chr to convert from the ASCII code to the character
        65.chr    # >> "A"
      + ?char to convert from the character to the ASCII code
        ?A    # >> 65
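
A small illustration of the version-independent forms mentioned above (a sketch only; the variable names are made up for this example). It deliberately uses only the two-argument index form and Integer#chr, which, as far as I know, give the same results in older and newer Ruby versions:

```ruby
s = "Ruby"

first = s[0, 1]          # one-character String at position 0
last  = s[-1, 1]         # negative positions count from the end
code_to_char = 65.chr    # Integer#chr: ASCII code -> character

puts first          # prints R
puts last           # prints y
puts code_to_char   # prints A
```
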
5. In Ruby, there are two sets of logical operators: [!, &&, ||] and
   [not, and, or]. [!, &&, ||]'s precedence is higher than the
   assignments (=, %=, ~=, /=, etc.) while [not, and, or]'s
   precedence is lower. Also note that while &&'s precedence is
   higher than ||'s, the and's precedence is the same as the or's.
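
The precedence rules above can be tried out with a few assignments. This is only a sketch; the variable names a to e are arbitrary and not part of the original list:

```ruby
# "||" binds tighter than "=", so the whole expression is assigned.
a = false || true        # parsed as a = (false || true)

# "or" binds looser than "=", so parentheses are needed to get the same result.
b = (false or true)

# Without parentheses the assignment happens first.
c = false or true        # parsed as (c = false) or true

# "&&" binds tighter than "||" ...
d = (true || false && false)     # true || (false && false)
# ... while "and" and "or" share one level and group left to right.
e = (true or false and false)    # (true or false) and false

p a   # prints true
p b   # prints true
p c   # prints false
p d   # prints true
p e   # prints false
```
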
6. In the case statement
        case obj
        when obj_1
          ....
        when obj_k
          ....
   it is the "===" method which is invoked, not the "==" method.
   Also, the order is "obj_k === obj" and not "obj === obj_k".
   The reason for this order is so that the case statement can
   "match" obj in more flexible ways. Three interesting cases are
   when obj_k is either a Module/Class, a Regexp, or a Range:
      + The Module/Class class defines the "===" method as a test
        whether obj is an instance of the module/class or its
        descendants ("obj.kind_of? obj_k").
      + The Regexp class defines the "===" method as a test whether
        obj matches the pattern ("obj =~ obj_k").
      + The Range class defines the "===" method as a test whether
        obj is an element of the range ("obj_k.include? obj").
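
The three cases above can be seen in one case statement. This is a minimal sketch; the method name describe and the sample values are invented for the illustration:

```ruby
# Each "when" clause calls pattern === obj, with the pattern on the left.
def describe(obj)
  case obj
  when Integer    then "integer"         # Integer === obj (class test)
  when /\A\d+\z/  then "digit string"    # /\A\d+\z/ === obj (pattern match)
  when 0.0..1.0   then "fraction"        # (0.0..1.0) === obj (range test)
  else                 "something else"
  end
end

puts describe(7)       # prints integer
puts describe("42")    # prints digit string
puts describe(0.5)     # prints fraction
puts describe(nil)     # prints something else

# The order of the operands matters:
p Integer === 7    # prints true
p 7 === Integer    # prints false
```
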
7. Array.new(2, Hash.new) # >> [{}, {}]
   but the two array elements are identical objects, not independent
   hashes. To create an array of (independent) hashes, use the "map"
   or "collect" method:
        arr = (1..2).map {Hash.new}
   Similarly, when creating a hash of arrays, probably the following
   is not the original intention:
        hsh = Hash.new([])
        while line = gets
          if line =~ /(\S+)\s+(\S+)/
            hsh[$1] << $2
          end
        end
        puts hsh.length    # >> 0
   One correct and concise way is to write "(hash[key] ||= []) <<
   value", such as
        hsh = Hash.new
        while line = gets
          if line =~ /(\S+)\s+(\S+)/
            (hsh[$1] ||= []) << $2
          end
        end
8. Be careful when using "mutable" objects as hash keys. To get the
   expected result, call Hash#rehash before accessing the hash
   elements. Example:
        s = "mutable"
        arr = [s]
        hsh = { arr => "object" }
        s.upcase!
        p hsh[arr] # >> nil (maybe not what was expected)
        hsh.rehash
        p hsh[arr] # >> "object"
9. After reading data from a file and putting them into variables,
   the data type is really String. To convert them into numbers, use
   the "to_i" or "to_f" methods. If, for example, you use the "+"
   operator to add the "numbers" without calling the conversion
   methods, you will simply concatenate the strings.
   An alternative is to use "scanf"
   ([6]http://www.rubyhacker.com/code/scanf).
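
A short sketch of the difference, using an array of strings as a stand-in for lines read from a file (the name lines and the sample data are assumptions made for this example):

```ruby
# Pretend these two lines came from a file; they are Strings, not numbers.
lines = ["3\n", "4.5\n"]

joined = lines[0].chomp + lines[1].chomp   # String#+ concatenates
sum    = lines[0].to_i + lines[1].to_f     # convert first, then add

p joined   # prints "34.5"
p sum      # prints 7.5
```
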
10. It is advisable not to put white space before the opening '(' in
    a method call; otherwise, Ruby with $VERBOSE set to true may give
    you a warning.

11. The "dot" for method call is the strongest operator. So for
    example, while in some other languages the number after the dot in
    a floating point number is optional, it is not in Ruby. For
    example, "1.e6" will try to call the method "e6" of the object 1
    (which is a Fixnum). You have to write "1.0e6".
    However, notice that although the dot is the strongest operator,
    its precedence with respect to method name may be different with
    different Ruby versions. At least in Ruby 1.6.7, "puts
    (1..3).length" will give you a syntax error; you should write
    "puts((1..3).length)" instead.

12. In Ruby, only false and nil are considered as false in a Boolean
    expression. In particular, 0 (zero), "" or '' (empty string), []
    (empty array), and {} (empty hash) are all considered as true.

13. Ruby variables hold references to objects and the = operator
    copies the references. Also, a self assignment such as a += b is
    actually translated to a = a + b. Therefore it may be advisable to
    be aware whether in a certain operation you are actually creating
    a new object or modifying an existing one.

14. There is no standard, built-in deep copy in Ruby. One way to
    achieve a similar effect is by serialization/marshalling. Because
    in Ruby everything is a reference, be careful when you want to
    "copy" objects (such as by using the dup or clone method),
    especially for objects that contain other objects (such as arrays
    and hashes) and when the containment is more than one level deep.

15. Ruby has no pre/post increment/decrement operator. For instance,
    x++ or x-- will fail to parse. More importantly, ++x or --x will
    do nothing! In fact, they behave as multiple unary prefix
    operators: -x == ---x == -----x == ......

16. "0..k" represents a Range object, while "[0..k]" represents an
    array with a single element of type Range. For example, if
         [0..2].each do |i|
           puts "i = #{i}"
         end
    does not give what you expect, probably you should have written
         (0..2).each do |i|
           puts "i = #{i}"
         end
    or
         0.upto(2) do |i|
           puts "i = #{i}"
         end
    instead. Note also that Ruby does not have objects of type "Tuple"
    (which are immutable arrays) and parentheses are usually put
    around a Range object for the purpose of precedence grouping (as
    the "dot" is stronger than the "dot dot" in the above example).

17. There is a subtle difference between instance variables and
    class variables. For instance variables, the order of creation
    does not matter: the classes simply "share" the variable. For
    example:
     class Base
       def initialize;     @var = 'base'; end
       def base_set_var;   @var = 'base'; end
       def base_print_var; puts @var;     end
     end

     class Derived < Base
       def initialize;        @var = 'derived'; super; end
       def derived_set_var;   @var = 'derived';        end
       def derived_print_var; puts @var;               end
     end
    
     d = Derived.new
     d.base_set_var;    d.derived_print_var    # >> 'base'
                        d.base_print_var       # >> 'base'
     d.derived_set_var; d.derived_print_var    # >> 'derived'
                        d.base_print_var       # >> 'derived'
    

    But for class variable, the order of creation does matter. In the
    following example, the derived class creates a class variable
    first, with the end result of creation of two distinct class
    variables:
     class Base
       def initialize;     @@var = 'base'; end
       def base_set_var;   @@var = 'base'; end
       def base_print_var; puts @@var;     end
     end

     class Derived < Base
       def initialize;        @@var = 'derived'; super; end
       def derived_set_var;   @@var = 'derived';        end
       def derived_print_var; puts @@var;               end
     end
    
     d = Derived.new
     d.base_set_var;    d.derived_print_var    # >> 'derived'
                        d.base_print_var       # >> 'base'
     d.derived_set_var; d.derived_print_var    # >> 'derived'
                        d.base_print_var       # >> 'base'
    

    For the classes in the inheritance chain to share a single class
    variable, the parent has to create the class variable first:
     class Base
       def initialize;     @@var = 'base'; end
       def base_set_var;   @@var = 'base'; end
       def base_print_var; puts @@var;     end
     end

     class Derived < Base
       def initialize;        super; @@var = 'derived'; end #changed
       def derived_set_var;   @@var = 'derived';        end
       def derived_print_var; puts @@var;               end
     end
    
     d = Derived.new
     d.base_set_var;    d.derived_print_var    # >> 'base'
                        d.base_print_var       # >> 'base'
     d.derived_set_var; d.derived_print_var    # >> 'derived'
                        d.base_print_var       # >> 'derived'
    

Things That Are Good to Know :-)

a. In Ruby the "self assignment operator" goes beyond "+=, -=, *=,
   /=, %=". In particular, operators such as "||=" also exist (but
   currently not for a class variable if it is not yet defined; this
   may change in the future). Please see Table 18.4 in the
   "Programming Ruby" book for the complete list.
b. For extensive numerical computations, consider "Numerical Ruby"
   ([7]http://www.ir.isas.ac.jp/~masa/ruby/index-e.html).
c. For (numerical) arrays which consume a large amount of memory and
   CPU time, consider "NArray" which is part of the Numerical Ruby
   ([8]http://www.ir.isas.ac.jp/~masa/ruby/na/SPEC.en).
d. For speeding up some parts of your Ruby code by writing them in C,
   consider "Inline"
   ([9]http://sourceforge.net/projects/rubyinline/).
e. For translation from Ruby to C, consider "rb2c"
   ([10]http://easter.kuee.kyoto-u.ac.jp/~hiwada/ruby/rb2c/).
f. For integration between Ruby and C/C++, consider "SWIG"
   ([11]http://www.swig.org/).
g. For integration between Ruby and Java, consider "JRuby"
   ([12]http://jruby.sourceforge.net/).
h. For integration between Ruby and Lua, consider "Ruby-Lua"
   ([13]http://ruby-lua.unolotiene.com/ruby-lua.whtm).
i. For creating a stand-alone (Windows) executable, consider "exerb"
   ([14]http://exerb.sourceforge.jp/index.en.html).
j. For manipulating raw bits, instead of using Fixnum's, consider
   "BitVector"
   ([15]http://www.ce.chalmers.se/~feldt/ruby/extensions/bitvector/).

 * For comments on this list, you may e-mail me directly at
   [16]billtj@glue.umd.edu.
 _________________________________________________________________

Last updated: Oct 16, 2002.
This list itself is available at
[17]http://www.glue.umd.edu/~billtj/ruby.html.
The plain text format is produced from the HTML format with "lynx
-dump".
_________________________________________________________________

References

  1. file://localhost/.automount/tulsi/home/tjb/Conf/ruby.txt
  2. http://www.ruby-lang.org/en/
  3. http://www.rubycentral.com/faq/
  4. http://rwiki.jin.gr.jp/cgi-bin/rw-cgi.rb?cmd=view;name=pitfall
  5. http://www.rubycentral.com/book/
  6. http://www.rubyhacker.com/code/scanf
  7. http://www.ir.isas.ac.jp/~masa/ruby/index-e.html
  8. http://www.ir.isas.ac.jp/~masa/ruby/na/SPEC.en
  9. http://sourceforge.net/projects/rubyinline/
  10. http://easter.kuee.kyoto-u.ac.jp/~hiwada/ruby/rb2c/
  11. http://www.swig.org/
  12. http://jruby.sourceforge.net/
  13. http://ruby-lua.unolotiene.com/ruby-lua.whtm
  14. http://exerb.sourceforge.jp/index.en.html
  15. http://www.ce.chalmers.se/~feldt/ruby/extensions/bitvector/
  16. mailto:billtj@glue.umd.edu
  17. http://www.glue.umd.edu/~billtj/ruby.html
13. Ruby variables hold references to objects and the = operator
    copies the references. Also, a self assignment such as a += b is
    actually translated to a = a + b. Therefore it may be advisable to
    be aware whether in a certain operation you are actually creating
    a new object or modifying an existing one.

a. In Ruby the “self assignment operator” goes beyond “+=, -=, *=,
/=, %=”. In particular, operators such as “||=” also exist (but
currently not for a class variable if it is not yet defined; this
may change in the future). Please see Table 18.4 in the
“Programming Ruby” book for the complete list.

Also notice (performance tip) that
string << "another" is much faster than string += "another" (no extra
object creation),
so you should use a class-defined update method if one exists.

Gergo
+-[Kontra, Gergely @ Budapest University of Technology and Economics]-+
|    Email: kgergely@mcl.hu,  kgergely@turul.eet.bme.hu          |
|    URL: turul.eet.bme.hu/~kgergely  Mobile: (+36 20) 356 9656  |
+------"I am such a genius that I should carry a fire extinguisher!"-------+
.
Hungarian PHP mirror and Hungarian PHP documentation: http://hu.php.net

Books on Ruby:

Online copy of the Pragmatic Programmer’s Guide available
at http://www.rubycentral.com/book/

Hardcopy also available from Amazon. (4.5 stars / 22 reviews)

Other books available at Amazon:

Ruby In A Nutshell by Yukihiro Matsumoto, David L. Reynolds (Translator) (5 / 7)
The Ruby Way by Hal Fulton (5 / 1)
Ruby Developer’s Guide by Robert Feldt, et al (5 / 1)
Sams Teach Yourself Ruby in 21 Days by Mark Slagell (5 / 1)
Making Use of Ruby by Suresh Mahadevan, et al (5 / 2)

Hi,

Based on the comments that I received, I just updated the list
(http://www.glue.umd.edu/~billtj/ruby.html).

Regards,

Bill

···

===========================================================================

               Things That Newcomers to Ruby Should Know

Table of Contents

 * Resources

1. Using warnings
2. Interactive shell
3. On-screen documentation
4. Class#method notation
5. Getting characters from a String
6. Array and Hash default values
7. Mutable Hash keys
8. Reading numerals from a file
9. Pre/Post Increment/Decrement Operators
10. Lexical scoping in blocks
11. Two sets of logical operators
12. The === operator and case statements
13. White space
14. The "dot" method call operator
15. Range objects
16. Boolean values
17. Variables, references, and objects
18. Deep copy
19. Class variables
20. Substituting Backslashes
 * Things That Are Good to Know :-)
 _________________________________________________________________

 * Resources:
      + HOME PAGE: http://www.ruby-lang.org/en/
      + FAQ: http://www.rubycentral.com/faq/
      + PITFALL:
        http://rwiki.jin.gr.jp/cgi-bin/rw-cgi.rb?cmd=view;name=pitfall
      + ONLINE TUTORIAL/DOC/BOOK: http://www.rubycentral.com/book/
      + VERY USEFUL HINTS:
           o "Programming Ruby" book by David Thomas and Andrew Hunt,
             "When Trouble Strikes" Chapter, "But It Doesn't Work"
             Section
           o "The Ruby Way" book by Hal Fulton, Chapter 1: "Ruby In
             Review"

1. Use "ruby -w" instead of simply "ruby" to get helpful warnings. If
   not invoking "ruby" directly, you can set the environment variable
   RUBYOPT to 'w':
      + win32:
        C:\> set RUBYOPT=w
            or
        pressing F5 (to execute) in the Scite editor will give you 
        warnings
        (and F4 will position at problematic line).
      + unix:
        sh# export RUBYOPT="w"
            or
        csh# setenv RUBYOPT "w"

2. Ruby has an interactive shell; try to invoke the command "irb"
   instead of "ruby". "irb" is best used for experimenting with the
   language and classes; you may try things out in this environment
   before putting them in your programs.

3. For convenient on-screen Ruby documentation, consider using (and
   installing, if necessary) "ri"
   (http://www.pragmaticprogrammer.com/ruby/downloads/ri.html).
   For example, to see the methods of the File class, run "ri File".
   To read about its open method, type "ri File.open".

4. The notation "Klass#method" in documentation is used only to
   represent an "instance method" of an object of class Klass; it is
   not a Ruby syntax at all. A "class method" in documentation, on
   the other hand, is usually represented as "Klass.method" (which is
   a valid Ruby syntax).
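
To make the two notations concrete, here is a tiny sketch. The choice of String#upcase and File.basename is just for illustration, and the path is made up:

```ruby
# Documentation writes "String#upcase": an instance method.
# In code it is called on a String object with a dot.
shout = "ruby".upcase

# Documentation writes "File.basename": a class method.
# In code it is called on the class itself, exactly as written.
name = File.basename("/tmp/example.rb")

puts shout   # prints RUBY
puts name    # prints example.rb
```
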

5. The String#[Fixnum] method does not return the "character" (which
   is a string of length one) at the Fixnum position, but instead the
   ASCII character code at the position (however, this may change in
   the future). Currently, to get the character itself, use
   String#[Fixnum,1] instead.
   Furthermore, there are additional ASCII conversion methods such as
      + Integer#chr to convert from the ASCII code to the character
        65.chr    # -> "A"
      + ?chr to convert from the character to the ASCII code
        ?A        # -> 65
   Using these properties, for example, two ways to get the last
   character in a string are "aString[-1, 1]" and
   "aString[-1].chr".

6. Array.new(2, Hash.new) # -> [{}, {}]
   but the two array elements are identical objects, not independent
   hashes. To create an array of (independent) hashes, use the "map"
   or "collect" method:
        arr = (1..2).map {Hash.new}
   Similarly, when creating a hash of arrays, probably the following
   is not the original intention:
        hsh = Hash.new([])
        while line = gets
          if line =~ /(\S+)\s+(\S+)/
            hsh[$1] << $2
          end
        end
        puts hsh.length    # -> 0
   One correct and concise way is to write "(hash[key] ||= []) <<
   value", such as
        hsh = Hash.new
        while line = gets
          if line =~ /(\S+)\s+(\S+)/
            (hsh[$1] ||= []) << $2
          end
        end
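
The sharing can be observed directly, and a block can be used to get independent objects. The following is only a sketch, assuming a Ruby version in which Array.new and Hash.new accept a block; all variable names are invented for the example:

```ruby
# One Hash object, referenced from both array slots.
shared = Array.new(2, Hash.new)
shared[0][:x] = 1
p shared[1]                       # prints the same hash, including :x
p shared[0].equal?(shared[1])     # prints true

# The block runs once per element, so each slot gets its own Hash.
independent = Array.new(2) { Hash.new }
independent[0][:x] = 1
p independent[1]                  # prints {}

# The default object is handed out but never stored under the key.
bad = Hash.new([])
bad["a"] << 1
p bad.length                      # prints 0

# A default block can store a fresh array under each missing key.
groups = Hash.new { |hash, key| hash[key] = [] }
groups["a"] << 1
groups["a"] << 2
groups["b"] << 3
p groups.length                   # prints 2
```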

7. Be careful when using "mutable" objects as hash keys. To get the
   expected result, call Hash#rehash before accessing the hash
   elements. Example:
        s = "mutable"
        arr = [s]
        hsh = { arr => "object" }
        s.upcase!
        p hsh[arr]    # -> nil (maybe not what was expected)
        hsh.rehash
        p hsh[arr]    # -> "object"

8. After reading data from a file and putting them into variables,
   the data type is really String. To convert them into numbers, use
   the "to_i" or "to_f" methods. If, for example, you use the "+"
   operator to add the "numbers" without calling the conversion
   methods, you will simply concatenate the strings.
   An alternative is to use "scanf"
   (http://www.rubyhacker.com/code/scanf).

9. Ruby has no pre/post increment/decrement operator. For instance,
   x++ or x-- will fail to parse. More importantly, ++x or --x will
   do nothing! In fact, they behave as multiple unary prefix
   operators: -x == ---x == -----x == ......
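
A quick way to convince yourself (a sketch; the variable names are arbitrary):

```ruby
x = 5

y = --x    # parsed as -(-x); x itself is not changed
z = ++x    # parsed as +(+x); again no change to x

p x        # prints 5
p y        # prints 5
p z        # prints 5

x += 1     # the usual way to increment
p x        # prints 6
```
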
10. Beware of the lexical scoping interaction between local variables
    and block local variables. If a local variable is already defined
    before the block, then the block will use (and quite possibly
    modify) the local variable; in this case the block does not
    introduce a new scope. Example:
         (0..2).each do |i|
           puts "inside block: i = #{i}"
         end
         puts "outside block: i = #{i}"    # -> undefined `i'
    On the other hand,
         i = 0
         (0..2).each do |i|
           puts "inside block: i = #{i}"
         end
         puts "outside block: i = #{i}"    # -> 'outside block: i = 2'
    and
         j = 0
         (0..2).each do |i|
           j = i
         end
         puts "outside block: j = #{j}"    # -> 'outside block: j = 2'

11. In Ruby, there are two sets of logical operators: [!, &&, ||] and
    [not, and, or]. [!, &&, ||]'s precedence is higher than the
    assignments (=, %=, ~=, /=, etc.) while [not, and, or]'s
    precedence is lower. Also note that while &&'s precedence is
    higher than ||'s, the and's precedence is the same as the or's.

12. In the case statement
         case obj
         when obj_1
           ....
         when obj_k
           ....
    it is the "===" method which is invoked, not the "==" method.
    Also, the order is "obj_k === obj" and not "obj === obj_k".
    The reason for this order is so that the case statement can
    "match" obj in more flexible ways. Three interesting cases are
    when obj_k is either a Module/Class, a Regexp, or a Range:
       + The Module/Class class defines the "===" method as a test
         whether obj is an instance of the module/class or its
         descendants ("obj.kind_of? obj_k").
       + The Regexp class defines the "===" method as a test whether
         obj matches the pattern ("obj =~ obj_k").
       + The Range class defines the "===" method as a test whether
         obj is an element of the range ("obj_k.include? obj").

13. It is advisable not to put white space before the opening '(' in
    a method call; otherwise, Ruby with $VERBOSE set to true may give
    you a warning.

14. The "dot" for method call is the strongest operator. So for
    example, while in some other languages the number after the dot in
    a floating point number is optional, it is not in Ruby. For
    example, "1.e6" will try to call the method "e6" of the object 1
    (which is a Fixnum). You have to write "1.0e6".
    However, notice that although the dot is the strongest operator,
    its precedence with respect to method name may be different with
    different Ruby versions. At least in Ruby 1.6.7, "puts
    (1..3).length" will give you a syntax error; you should write
    "puts((1..3).length)" instead.

15. "0..k" represents a Range object, while "[0..k]" represents an
    array with a single element of type Range. For example, if
         [0..2].each do |i|
           puts "i = #{i}"
         end
    does not give what you expect, probably you should have written
         (0..2).each do |i|
           puts "i = #{i}"
         end
    or
         0.upto(2) do |i|
           puts "i = #{i}"
         end
    instead. Notice also that Ruby does not have objects of type
    "Tuple" (which are immutable arrays) and parentheses are usually
    put around a Range object for the purpose of precedence grouping
    (as the "dot" is stronger than the "dot dot" in the above
    example).

16. In Ruby, only false and nil are considered as false in a Boolean
    expression. In particular, 0 (zero), "" or '' (empty string), []
    (empty array), and {} (empty hash) are all considered as true.
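
This can be checked by filtering a list of candidate values. A small sketch (the names values and truthy are made up for the example):

```ruby
values = [0, "", [], {}, false, nil]

# Keep only the values that count as true in a condition.
truthy = values.select { |v| v }

p truthy          # prints [0, "", [], {}]
p truthy.length   # prints 4
```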

17. Ruby variables hold references to objects and the = operator
    copies the references. Also, a self assignment such as a += b is
    actually translated to a = a + b. Therefore it may be advisable to
    be aware whether in a certain operation you are actually creating
    a new object or modifying an existing one.
    For example, string << "another" is faster than string +=
    "another" (no extra object creation), so you would be better off
    using any class-defined update-method (if that is really your
    intention), if it exists.
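
The difference is visible as soon as two variables refer to the same string. This is a sketch only; String.new is used here just to be sure of starting with a modifiable string, and the variable names are arbitrary:

```ruby
a = String.new("ruby")
b = a                  # b refers to the same object as a

a << "!"               # modifies that object in place
p b                    # prints "ruby!"
p a.equal?(b)          # prints true

a += "?"               # a = a + "?": a now refers to a new object
p a                    # prints "ruby!?"
p b                    # prints "ruby!"
p a.equal?(b)          # prints false
```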

18. There is no standard, built-in deep copy in Ruby. One way to
    achieve a similar effect is by serialization/marshalling. Because
    in Ruby everything is a reference, be careful when you want to
    "copy" objects (such as by using the dup or clone method),
    especially for objects that contain other objects (such as arrays
    and hashes) and when the containment is more than one level deep.
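
The marshalling approach can be sketched as follows, using the standard Marshal module. The data and variable names are made up, and note that not every object can be marshalled (procs and IO objects, for instance, cannot):

```ruby
original = { "list" => [1, [2, 3]] }

shallow = original.dup                          # copies only the outer hash
deep    = Marshal.load(Marshal.dump(original))  # rebuilds every nested object

original["list"][1] << 4    # modify a nested array in place

p shallow["list"]   # prints [1, [2, 3, 4]]  (inner arrays are shared)
p deep["list"]      # prints [1, [2, 3]]     (fully independent)
```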

19. A class variable is in general per-hierarchy, not per-class (i.e.,
    a class variable is "shared" by a parent and all of its
    descendants, in addition to being shared by all instances of that
    class). One subtle exception is if a child class creates a class
    variable before its parent does. For example, when a parent
    creates a class variable first:
     class Base
       def initialize;     @@var = 'base'; end
       def base_set_var;   @@var = 'base'; end
       def base_print_var; puts @@var;     end
     end

     class Derived < Base
       def initialize;        super; @@var = 'derived'; end #notice
       def derived_set_var;   @@var = 'derived';        end
       def derived_print_var; puts @@var;               end
     end
    
     d = Derived.new
     d.base_set_var;    d.derived_print_var    # -> 'base'
                        d.base_print_var       # -> 'base'
     d.derived_set_var; d.derived_print_var    # -> 'derived'
                        d.base_print_var       # -> 'derived'
    

    In the above code, the class variable @@var is indeed "shared" by
    the Base and Derived classes. However, now see what happens when a
    child class creates the variable first:
     class Base
       def initialize;     @@var = 'base'; end
       def base_set_var;   @@var = 'base'; end
       def base_print_var; puts @@var;     end
     end

     class Derived < Base
       def initialize;        @@var = 'derived'; super; end #changed
       def derived_set_var;   @@var = 'derived';        end
       def derived_print_var; puts @@var;               end
     end
    
     d = Derived.new
     d.base_set_var;    d.derived_print_var    # -> 'derived'
                        d.base_print_var       # -> 'base'
     d.derived_set_var; d.derived_print_var    # -> 'derived'
                        d.base_print_var       # -> 'base'
    

    In this case, the parent and child classes have two independent
    class variables with identical names.

20. Substituting backslashes may be tricky. Example:
         str = 'a\b\c'                     # -> a\b\c
         puts str.gsub(/\\/,'\\\\')        # -> a\b\c
         puts str.gsub(/\\/,'\\\\\\')      # -> a\\b\\c
         puts str.gsub(/\\/,'\\\\\\\\')    # -> a\\b\\c
         puts str.gsub(/\\/) { '\\\\' }    # -> a\\b\\c
         puts str.gsub(/\\/, '\&\&')       # -> a\\b\\c

Things That Are Good to Know :-)

a. In Ruby the "self assignment operator" goes beyond "+=, -=, *=,
   /=, %=". In particular, operators such as "||=" also exist (but
   currently not for a class variable if it is not yet defined; this
   may change in the future). Please see Table 18.4 in the
   "Programming Ruby" book for the complete list.

b. For a "cookbook" with many algorithm and code examples, consider
   "PLEAC-Ruby" (http://pleac.sourceforge.net/pleac_ruby/t1.html).

c. For extensive numerical computations, consider "Numerical Ruby"
   (http://www.ir.isas.ac.jp/~masa/ruby/index-e.html).

d. For (numerical) arrays which consume a large amount of memory
   and/or CPU time, consider "NArray" which is part of the Numerical
   Ruby (http://www.ir.isas.ac.jp/~masa/ruby/na/SPEC.en).

e. For speeding up some parts of your Ruby code by writing them in C,
   consider "Inline" (http://sourceforge.net/projects/rubyinline/).

f. For Ruby to C translation, consider "rb2c"
   (http://easter.kuee.kyoto-u.ac.jp/~hiwada/ruby/rb2c/).

g. For Ruby and C/C++ integration, consider "SWIG"
   (http://www.swig.org/).

h. For Ruby and Java integration, consider "JRuby"
   (http://jruby.sourceforge.net/).

i. For Ruby and Lua integration, consider "Ruby-Lua"
   (http://ruby-lua.unolotiene.com/ruby-lua.whtm).

j. For creating a stand-alone (Windows) executable, consider "exerb"
   (http://exerb.sourceforge.jp/index.en.html).

k. For manipulating raw bits, instead of using Fixnum's, consider
   "BitVector"
   (http://www.ce.chalmers.se/~feldt/ruby/extensions/bitvector/).

 * For comments on this list, you may e-mail me directly at
   billtj@glue.umd.edu.
 _________________________________________________________________

Last updated: Oct 21, 2002.
This list itself is available at
http://www.glue.umd.edu/~billtj/ruby.html.
The plain text format is produced from the HTML format with "lynx
-dump -nolist" (and some minor editing).
_________________________________________________________________

Hi,

Thanks. Although I think performance-related issues should be in another
list, I will include it as it is not too long.

Regards,

Bill

···

==========================================================================
Kontra, Gergely kgergely@mlabdial.hit.bme.hu wrote:

Also notice (performance tip) that
string << "another" is much faster than string += "another" (no extra
object creation),
so you should use a class-defined update method if one exists.

Hmm, why does ‘string += another’ need to create a new object? string
already exists, so wouldn’t the reference simply be updated?

Ian

···

On Sat 19 Oct 2002 at 00:42:20 +0900, Kontra, Gergely wrote:

13. Ruby variables hold references to objects and the = operator
    copies the references. Also, a self assignment such as a += b is
    actually translated to a = a + b. Therefore it may be advisable to
    be aware whether in a certain operation you are actually creating
    a new object or modifying an existing one.

a. In Ruby the “self assignment operator” goes beyond “+=, -=, *=,
/=, %=”. In particular, operators such as “||=” also exist (but
currently not for a class variable if it is not yet defined; this
may change in the future). Please see Table 18.4 in the
“Programming Ruby” book for the complete list.

Also notice (performance tip) that
string << "another" is much faster than string += "another" (no extra
object creation),
so you should use a class-defined update method if one exists.


Ian Macdonald | Q: What do you call a monk who has had a
ian@caliban.org | sex change operation? A: A transsister.

It’s a feature. :-) The notation a += b really, REALLY
means a = a + b. There is a + operator in Ruby, but never a
+= operator (nor any other assignment operator).

Hal

···

----- Original Message -----
From: “Ian Macdonald” ian@caliban.org
To: “ruby-talk ML” ruby-talk@ruby-lang.org
Sent: Monday, October 21, 2002 10:03 PM
Subject: Re: Things That Newcomers to Ruby Should Know (10/16/02)

Hmm, why does ‘string += another’ need to create a new object? string
already exists, so wouldn’t the reference simply be updated?

I understand that, but since ‘a’ already exists prior to assigning a + b
to it, why does it need to be recreated? Where strings are concerned,
I don’t see why this would be any different to ‘a << b’.

I don’t doubt for a minute that I’m being told the truth here; I just
fail to understand why this way of doing things is necessarily less
efficient.

Ian

···

On Tue 22 Oct 2002 at 12:18:22 +0900, Hal E. Fulton wrote:

----- Original Message -----
From: “Ian Macdonald” ian@caliban.org
To: “ruby-talk ML” ruby-talk@ruby-lang.org
Sent: Monday, October 21, 2002 10:03 PM
Subject: Re: Things That Newcomers to Ruby Should Know (10/16/02)

Hmm, why does ‘string += another’ need to create a new object? string
already exists, so wouldn’t the reference simply be updated?

It’s a feature. :-) The notation a += b really, REALLY
means a = a + b.


Ian Macdonald | Cocaine: The thinking man’s Dristan.
ian@caliban.org |

From: Ian Macdonald [mailto:ian@caliban.org]
Sent: Monday, October 21, 2002 11:30 PM
To: ruby-talk ML
Subject: Re: Things That Newcomers to Ruby Should Know (10/16/02)

I understand that, but since ‘a’ already exists prior to
assigning a + b to it, why does it need to be recreated?
Where strings are concerned, I don’t see why this would be
any different to ‘a << b’.

‘a’ is a variable…not an object.

With a << b you are invoking the method ‘<<’ on the object pointed to by
‘a’ with the parameter ‘b’…you are not changing the variable’s
reference.

With a = a + b you are assigning to the variable ‘a’ the return value of
the ‘+’ method called on the object ‘a’ points to (passing the parameter
‘b’). And a += b is just syntactic sugar for a = a + b.
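
One way to see the variable/object distinction is with a frozen string. This is only a sketch; it assumes a Ruby where modifying a frozen String raises a RuntimeError or a subclass of it (FrozenError in Ruby 2.5 and later):

```ruby
a = "abc".freeze

# `a += "d"` only rebinds the variable, so the frozen object is no obstacle.
a += "d"
puts a                         # abcd

b = "abc".freeze
begin
  b << "d"                     # a method call that has to modify the object
rescue RuntimeError => e       # FrozenError is a RuntimeError subclass
  append_error = e.class
end
puts b                         # abc
```

The `+=` succeeds because nothing is ever done to the frozen object except calling `+` on it, while `<<` fails because it must change the object itself.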

-rich

···


Look at it this way: The expression a + b yields a
new object, does it not? You wouldn’t want a + b
simply to append b onto a, would you?

The new object which we describe as a + b has no way
of knowing that it’s about to be assigned back to
variable a.

It might conceptually be possible to have the
interpreter “optimize” a+=b into a<<b (for certain
classes), but I wouldn’t favor that. There could
be consequences, I’d think.
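
A sketch of one such consequence, using an array that holds the strings (the names here are made up for illustration). If `+=` were quietly turned into `<<`, the first half of this example would behave like the second half:

```ruby
words = ["alpha".dup, "beta".dup]

current = words[0]             # second reference to the same String
current += " one"              # new object; words[0] is unaffected
puts words[0]                  # alpha
puts current                   # alpha one

other = words[1]
other << " two"                # modifies the String stored in the array
puts words[1]                  # beta two
```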

Hal

···

----- Original Message -----
From: “Ian Macdonald” ian@caliban.org
To: “ruby-talk ML” ruby-talk@ruby-lang.org
Sent: Monday, October 21, 2002 10:30 PM
Subject: Re: Things That Newcomers to Ruby Should Know (10/16/02)

I understand that, but since ‘a’ already exists prior to assigning a + b
to it, why does it need to be recreated? Where strings are concerned,
I don’t see why this would be any different to ‘a << b’.

Hi Ian,

Based on previous responses, I think a standard example answer to your
question is something like this:

a = 'aString'
c = a
a += ' modified using = +'
puts c    # -> "aString"

a = 'aString'
c = a
a << ' modified using <<'
puts c    # -> "aString modified using <<"

Because in Ruby the “=” operator only copies the reference (but not the
object), based on the above example, probably you can tell which is more
natural: for the “a += b” to behave like “a = a + b” or to behave like “a
<< b”.

If you know C++, then you know that Ruby has made a particular
choice: Ruby does not overload the assignment operator and Ruby does not
treat the “+=” operator as independent from the corresponding
“=” operator. Although there are always exceptions, I think so far Ruby’s
philosophy in this regard is the most natural.

Furthermore, Ruby still gives you a choice for a String, as in the example
above. If your intention is to create a new object (so as not to affect c),
you want to use “+=”; if your intention is really to modify the object
referred to by a, you want to use “<<” (but be aware that other
variables such as c are also affected).
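
The same choice exists for other mutable classes. As a small sketch (variable names are illustrative), Array#+ returns a new array while Array#<< modifies the receiver:

```ruby
list = [1, 2]
snapshot = list
list += [3]                    # list = list + [3]: a new Array
puts snapshot.inspect          # [1, 2]
puts list.inspect              # [1, 2, 3]

shared = [1, 2]
view = shared
shared << 3                    # same Array object, modified in place
puts view.inspect              # [1, 2, 3]
```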

Regards,

Bill

···

Hello Ian,

Tuesday, October 22, 2002, 7:30:19 AM, you wrote:

I understand that, but since ‘a’ already exists prior to assigning a + b
to it, why does it need to be recreated? Where strings are concerned,
I don’t see why this would be any different to ‘a << b’.

The main problem is that the object pointed to by ‘a’ may also be pointed
to by another variable, and if you change this object directly your code
will have a SIDE EFFECT. Do you really need this? :-) So ‘<<’ and other
methods that change the object itself must be used very carefully,
preferably on local variables and other “own” objects.

An example of this side effect:

def a(s)
  s << '!'      # modifies the caller's object
  s += '?'      # creates a new object; the caller's object is not touched
  return s
end

x = '1'
y = a(x)
p x, y          # prints "1!" and "1!?"
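
If the side effect is not wanted, one common approach is to copy the argument before modifying it. A minimal sketch (the method name decorate is made up for illustration):

```ruby
def decorate(s)
  s = s.dup       # private copy; the caller's object is left alone
  s << '!'
  s << '?'
  s
end

x = '1'
y = decorate(x)
p x, y            # prints "1" and "1!?"
```

After the dup, the in-place `<<` is safe because no other variable can refer to the copy.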

···


Best regards,
Bulat mailto:bulatz@integ.ru

Would it be possible for Ruby to know whether an object is referenced by
several variables so += does ‘<<’ or ‘… = … + …’ when appropriate?

Perhaps a bit in each String indicating whether it is shared or
not would make it… and GC issues could be ignored at first.

···

On Wed, Oct 23, 2002 at 06:28:29PM +0900, Bulat Ziganshin wrote:


Running Debian GNU/Linux Sid (unstable)
batsman dot geo at yahoo dot com

The most important design issue… is the fact that Linux is supposed to
be fun…
– Linus Torvalds at the First Dutch International Symposium on Linux

Thanks, William. Your example is both succinct and very clear.

I’ve only been programming in Ruby for about nine months, so things
like this still catch me out from time to time. On more than one
occasion, I’ve been surprised by my variable changing value, simply
because another pointer to the same variable was used to change its
value.

I’d suggest you put the above code snippet in your ‘Things That
Newcomers to Ruby Should Know’ document, if it’s not already there.

Thanks to Hal, too, for his explanation.

Ian

···

On Tue 22 Oct 2002 at 22:16:50 +0900, William Djaja Tjokroaminata wrote:

Based on previous responses, I think a standard example answer to your
question is something like this:

a = 'aString'
c = a
a += ' modified using = +'
puts c    # -> "aString"

a = 'aString'
c = a
a << ' modified using <<'
puts c    # -> "aString modified using <<"

Because in Ruby the “=” operator only copies the reference (but not the
object), based on the above example, probably you can tell which is more
natural: for the “a += b” to behave like “a = a + b” or to behave like “a
<< b”.


Ian Macdonald | In Lexington, Kentucky, it’s illegal to
ian@caliban.org | carry an ice cream cone in your pocket.

Hello Mauricio,

Wednesday, October 23, 2002, 4:09:02 PM, you wrote:

def a(s)
  s << '!'
  s += '?'
  return s
end

Would it be possible for Ruby to know whether an object is referenced by
several variables so += does ‘<<’ or ‘… = … + …’ when appropriate?

Perhaps a bit in each String indicating whether it is shared or
not would make it… and GC issues could be ignored at first.

Simple things must remain simple :-) Instead of counting references
and checking the reference count at each ‘+=’, Ruby just creates new
objects and garbage-collects the old, unreferenced ones.
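
The extra objects can be made visible with a rough sketch like the following. It assumes CRuby 2.2 or later, where GC.stat accepts the :total_allocated_objects key; the helper name allocations is made up, and the exact counts will vary between Ruby versions:

```ruby
# Counts how many objects are allocated while the block runs.
def allocations
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

piece  = "x"
rounds = 10_000

with_plus = allocations do
  s = "".dup
  rounds.times { s += piece }    # one new String per iteration
end

with_append = allocations do
  s = "".dup
  rounds.times { s << piece }    # keeps growing the same String
end

puts "+= allocated #{with_plus} objects"
puts "<< allocated #{with_append} objects"
```

Each discarded intermediate string from the `+=` loop is left for the garbage collector, which is the cost the performance tip is about.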

···


Best regards,
Bulat mailto:bulatz@integ.ru

Would it be possible for Ruby to know whether an object is referenced by
several variables so += does ‘<<’ or ‘… = … + …’ when appropriate?

Perhaps a bit in each String indicating whether it is shared or
not would make it… and GC issues could be ignored at first.

I don’t think so. += is syntax sugar for you-know-what. Imagine trying to
explain to a newbie “… but except in this case …”

Side-effects are a part of life in Ruby (and not only in Ruby). Programmers
have to deal with them. They’re not even a bad thing.

Gavin

···

From: “Mauricio Fernández” batsman.geo@yahoo.com

On Wed, Oct 23, 2002 at 06:28:29PM +0900, Bulat Ziganshin wrote:

Hi Ian,

I will put it in the list.

Regards,

Bill

···

===========================================================================
Ian Macdonald ian@caliban.org wrote:

I’ve only been programming in Ruby for about nine months, so things
like this still catch me out from time to time. On more than one
occasion, I’ve been surprised by my variable changing value, simply
because another pointer to the same variable was used to change its
value.

I’d suggest you put the above code snippet in your ‘Things That
Newcomers to Ruby Should Know’ document, if it’s not already there.

[Disclaimer: I’m not proposing a modification of the language, see [52838].
Even though there’s a subtle change in the semantics, no code should
break; this is only an optimization, and nothing else.]

a = "bla bla"
a += " --- wtf?"
p a    # => "bla bla --- wtf?"

In this stupid example, we surely don’t care whether Ruby is creating a
new object or not, and the programmer doesn’t mind having Ruby do
a << " --- wtf?"
instead of
a = a + " --- wtf?"

Now we certainly don’t want Ruby to use ‘<<’ internally in
b = a
b += " the object pointed to by a should not change!"

I consider the use of ‘<<’ when possible to be nothing but a slight
optimization which might (or might not) prove to make Ruby faster.
You can do an “incomplete RC” (reference counting; in fact I think
something like that was already used in Ruby, somewhere), for the common
case where an object is referenced only once. If another variable
references it, set a “shared bit” to 1, and that’s it. So you don’t have
to do complete RC, which would be expensive.

So we’d have
a = "a"
b = "b"
a += "a"    # => a == "aa", using '<<'
b = a       # => b == "aa"
b += "b"    # => b == "aab", using ' = … + …'

simple RC in action now

a += "a"    # => a == "aaa", using ' = … + …'

because the ‘shared bit’ wasn’t reset, as we’re not doing real RC

This is not a language modification (unless somebody really needs a new
object to be created always when doing +=), only a (possible) speed
optimization.
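
The proposed behaviour can be modelled in plain Ruby as a rough sketch. Everything here is hypothetical: the class FlaggedString and its methods share and plus_assign are invented for illustration, and this is not how the interpreter works. Since assignment cannot be intercepted from Ruby code, `b = a` is modelled by an explicit call to share:

```ruby
# Toy model of the "shared bit" idea; not how CRuby is implemented.
class FlaggedString
  attr_reader :value

  def initialize(value)
    @value  = value.dup
    @shared = false
  end

  def shared?
    @shared
  end

  # Models `b = a`: a second reference appears, so the bit is set.
  # It is never reset, as in the "incomplete RC" described above.
  def share
    @shared = true
    self
  end

  # Models `a += text`. Returns the object the variable should now
  # refer to, plus a symbol naming the strategy that was used.
  def plus_assign(text)
    if shared?
      [FlaggedString.new(@value + text), :copy]    # a = a + text
    else
      @value << text                               # a << text
      [self, :in_place]
    end
  end
end

a = FlaggedString.new("a")
a, step1 = a.plus_assign("a")    # unshared: appended in place
b = a.share                      # b = a
b, step2 = b.plus_assign("b")    # shared: a copy is made for b
a, step3 = a.plus_assign("a")    # bit still set: a copy again

puts a.value                     # aaa
puts b.value                     # aab
p [step1, step2, step3]          # [:in_place, :copy, :copy]
```

The third step shows the limitation mentioned in the message: once the bit is set it stays set, so the object keeps taking the slower path even after the other reference is gone.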

PS: I’m getting a computer of my own next week, and as I’m gonna have a
lot of free time, I may try this and other different tricks on Ruby’s
implementation, following my new policy on how to improve Ruby [52838]
http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-talk/52838

···

On Wed, Oct 23, 2002 at 09:35:34PM +0900, Gavin Sinclair wrote:

Would it be possible for Ruby to know whether an object is referenced by
several variables so += does ‘<<’ or ‘… = … + …’ when appropriate?

Perhaps a bit in each String indicating whether it is shared or
not would make it… and GC issues could be ignored at first.

I don’t think so. += is syntax sugar for you-know-what. Imagine trying to
explain to a newbie “… but except in this case …”

Side-effects are a part of life in Ruby (and not only in Ruby). Programmers
have to deal with them. They’re not even a bad thing.


Running Debian GNU/Linux Sid (unstable)
batsman dot geo at yahoo dot com

Stupid nick highlighting
Whenever someone starts with “stupid” it highlights the nick. Hmm.
#Debian

Yeah, KISS applies always, but I may anyway try to implement this in the
future for the following reasons:

  • the modification seems easy enough
  • I will possibly learn more about Ruby’s internals in the process
  • if it’s simple and makes Ruby faster, what can be wrong about it?

If I consider there’s a limited resource (developer time/skills) and
something to make (a better Ruby implementation), it would of course
make sense for me to spend time on other implementation issues; but as
this is as much a “ruby internals exercise” as an actual usable thing,
it might end up improving my skills, and actually increasing the amount
of available resources (time x skill), and henceforth my output in the
midterm :-)!

···

On Wed, Oct 23, 2002 at 09:34:35PM +0900, Bulat Ziganshin wrote:

Hello Mauricio,

Wednesday, October 23, 2002, 4:09:02 PM, you wrote:

def a(s)
  s << '!'
  s += '?'
  return s
end

Would it be possible for Ruby to know whether an object is referenced by
several variables so += does ‘<<’ or ‘… = … + …’ when appropriate?

Perhaps a bit in each String indicating whether it is shared or
not would make it… and GC issues could be ignored at first.

Simple things must remain simple :-) Instead of counting references
and checking the reference count at each ‘+=’, Ruby just creates new
objects and garbage-collects the old, unreferenced ones.


Running Debian GNU/Linux Sid (unstable)
batsman dot geo at yahoo dot com

Hah! we have 2 Johnie Ingrams in the channel :-)
Hey all btw :-)

Hi batsman,

I consider the use of ‘<<’ when possible to be nothing but a slight
optimization which might (or not) prove to make Ruby faster.
You can do a “incomplete RC” (in fact I think something like that was
already used in Ruby, somewhere), for the common case where an object is
referenced only once. If another variable references it, set a “shared
bit” to 1, and that’s it. so you don’t have to do complete RC which
would be expensive.

If you do “incomplete RC”, you basically have to find all the places where
a variable (or another object) may start referring to an object. Have you
thought about assessing a “complete RC” implementation? I know that
the amount of work in going from “no RC” to “complete RC” is more than
twice that of going from “no RC” to “incomplete RC”, as Ruby currently
uses a mark & sweep (M&S) gc (and therefore there exists a bunch of
“new” functions but not a complete set of “delete” functions).

Well, Python started with RC and then added M&S. Ruby started with M&S
and now RC is being considered. I don’t know which direction is
harder, or whether both are equally hard. I am just thinking that it
would be very nice if we could select the Ruby gc mode.

PS: I’m getting a computer of my own next week, and as I’m gonna have a
lot of free time, I may try this and other different tricks on Ruby’s
implementation, following my new policy on how to improve Ruby [52838]
http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-talk/52838

Good luck on your quest and have fun with your new computer.

Regards,

Bill

···

Mauricio Fernández batsman.geo@yahoo.com wrote: