[QUIZ] Twisting a Rope (#137)

The three rules of Ruby Quiz:

1. Please do not post any solutions or spoiler discussion for this quiz until
48 hours have passed from the time on this message.

2. Support Ruby Quiz by submitting ideas as often as you can:

http://www.rubyquiz.com/

3. Enjoy!

Suggestion: A [QUIZ] in the subject of emails about the problem helps everyone
on Ruby Talk follow the discussion. Please reply to the original quiz message,
if you can.

···

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

by John Miller

This week's task is to implement the Rope data structure as a Ruby class. This
topic comes out of the ICFP programming competition
(http://www.icfpcontest.com/) which had competitors manipulating a 7.5 million
character string this year.

  What is a Rope:

You may not realize it, but for many changes the content to a String, Ruby
creates a new copy of the original with the modifications applied. For small
strings that are created once and read often this is actually a very efficient
way to do thing, but what happens when the string starts to get long, and is
undergoing a lot of changes? First, the program will spend more and more of its
processing cycles just copying bits around. Second, the garbage collector will
be called more and more often to pick up the little stringy scraps you've left
all over the memory.

Ropes (the name is a pun for a heavy duty string) are tree structures where a
node represents the concatenation of its left branch with its right, and leaves
are flat strings. (This is a little sloppy. A rope may contain shared subtrees,
and is thus really a directed acyclic graph, where the out-edges of each vertex
are ordered. We will continue to be sloppy.) E.g. To prepend Text A to Text B,
one creates a Node (call it N1) with A as its left branch and B as its right.
To further append Text C create a new Node N2 with its left branch pointing to
N1 and its right to C. Easy, right? To find out more see Boehm, Atkinson and
Plass "Ropes: an Alternative to Strings" at:

  http://rubyurl.com/2FRbO

The task comes in three parts, each increasing in difficulty:

Part one:

Create a Rope class that can do the following:

  a. 'append' or 'prepend' a String or another Rope
     (alias the << operator to the append function)
  b. Return the fully concatenated text with 'to_s'
  c. define 'slice' to call to_s.slice
  d. provide a 'length' method

Part two:

Add the following:

  a. Add the ability to 'index' a single character given a 0-based offset
     from the beginning of the string.
  b. Add the ability to 'shift' a single character from the front of a Rope.
     (Remove and return the character)
  c. Add your own 'slice' method that returns a String. Implement as many of
     the String method's forms as possible. To run the example code this
     function will only need to understand the slice(offset,length) form.
     Major Bonus for Regex and Substring forms.
  d. "Balance" the tree with a 'normalize' method.
     (see Boehm, Atkinson and Plass 1319 Rebalancing)

Part three: (bonus)

Add the following:

  a. Change the 'slice' method to return a Rope. Ideally this method should
     do as little string copying as possible. (Doing this will well
     dramatically increase the speed of the example code)
  b. Allow 'shift' to optionally accept an integer number of characters to
     remove and return.
  c. modify the '<<' operator so that can efficiently append a few
     characters at a time. (see Boehm, Atkinson and Plass 1318 para. 4)
  d. *Major Bonus* Add the ability to append and prepend IO classes in a
     lazy fashion. (see Boehm, Atkinson and Plass 1318 para. 2)

The following code may help you explore how efficient your code is and show
where Ropes are useful. `ruby -r /path/to/your/rope/class this_script.rb Rope`
will run the test with your code. Run the script without arguments to see how
well String does. Also play around with the SIZE and CHUNKS constants to get a
feel for how they affect performance.

  require 'benchmark'
  
  #This code make a String/Rope of CHUNCKS chunks of text
  #each chunck is SIZE bytes long. Each chunck starts with
  #an 8 byte number. Initially the chuncks are shuffled the
  #qsort method sorts them into ascending order.
  #
  #pass the name of the class to use as a parameter
  #ruby -r rope.rb this_file Rope
  
  puts 'preparing data...'
  TextClass = Object.const_get(ARGV.shift || :String)
  
  def qsort(text)
    return TextClass.new if text.length == 0
    pivot = text.slice(0,8).to_s.to_i
    less = TextClass.new
    more = TextClass.new
    offset = 8+SIZE
    while (offset < text.length)
      i = text.slice(offset,8).to_s.to_i
      (i < pivot ? less : more) << text.slice(offset,8+SIZE)
      offset = offset + 8+SIZE
    end
    print "*"
    return qsort(less) << text.slice(0,8+SIZE) << qsort(more)
  end
  
  SIZE = 512 * 1024
  CHUNCKS = 128
  CHARS = %w[R O P E]
  data = TextClass.new
  bulk_string =
    TextClass.new(Array.new(SIZE) { CHARS[rand(4)] }.join)
  puts 'Building Text...'
  build = Benchmark.measure do
    (0..CHUNCKS).sort_by { rand }.each do |n|
      data<< sprintf("%08i",n) << bulk_string
    end
    data.normalize if data.respond_to? :normalize
  end
  GC.start
  sort = Benchmark.measure do
    puts "Sorting Text..."
    qsort(data)
    puts"\nEND"
  end
  
  puts "Build: #{build}Sort: #{sort}"

My apologies,

The first line of the quiz should read:
"You may not realize it, but for many changes to the content of a
String,"

I am really looking forward to seeing some ingenious solutions to this
problem. The concept seemed very simple to me but my first attempt kept
becoming mired down in edge cases. I hope everyone has a good weekend
and a good Labor Day weekend to those in the US (plenty of time to work
on this weeks quiz :wink:

John Miller

···

--
Posted via http://www.ruby-forum.com/.

I've done parts one and two so far. I'll try to add more in the next
few days.

For simplicity and speed, append and prepend don't modify the
receiver.

If anyone has any questions about my code, I'll be happy to answer.

Carl

require "rubygems"
require "facets"
require "kernel/with"
require "symbol/to_proc"

class String
  def shift
    return nil if empty?
    returning self[0].chr do
      self[0] = ""
    end
  end
end

class Rope
  attr_reader :left, :right, :length

  def Rope.new(*args)
    if args.size <= 2
      super
    else # build balanced tree
      mid = args.size / 2
      args[mid,2] = Rope.new(*args[mid,2])
      Rope.new(*args)
    end
  end

  def initialize(left="",right="")
    @left, @right = left, right
    @length = @left.length + @right.length
  end

  def replace(rope)
    initialize(rope.left,rope.right)
    self
  end

  # clean out empty strings and rebuild tree
  def normalize
    Rope.new(*flat_strings.reject(&:empty?))
  end

  def to_s
    flat_strings.join('')
  end

  def append(str)
    Rope.new(self,str)
  end
  alias_method :<<, :append

  def prepend(str)
    Rope.new(str,self)
  end

  def slice(start,length=@length-start)
    if start > left.length # right
      right.slice(start-left.length,length-left.length)
    elsif left.length < length # left and right
      left.slice(start,left.length) +
      right.slice(start-left.length,length-left.length)
    else # left
      left.slice(start,length)
    end
  end

  def shift
    if letter = left.shift || right.shift
      @length -= 1
      letter
    else
      nil
    end
  end

  def index(letter,start=0)
    if start > left.length # right
      left.length + right.index(letter,start - left.length)
    else # left
      left.index(letter,start) || left.length + right.index(letter)
    end
  rescue
    nil
  end

  # TODO implement rest of methods
  def method_missing(method, *args, &block)
    result = to_s.send(method, *args, &block)
    if result.is_a?(String)
      if method.to_s =~ /!$/
        replace(Rope.new(result))
      else
        Rope.new(result)
      end
    else
      result
    end
  end

protected

  # traverse the tree
  def traverse(&block)
    @left.is_a?(Rope) ? @left.traverse(&block) : block.call(@left)
    @right.is_a?(Rope) ? @right.traverse(&block) : block.call(@right)
  end

  # collect all the flat strings
  def flat_strings
    returning do |ary|
      traverse { |str| ary << str }
    end
  end

end

···

On Aug 31, 6:19 am, Ruby Quiz <ja...@grayproductions.net> wrote:

The three rules of Ruby Quiz:

1. Please do not post any solutions or spoiler discussion for this quiz until
48 hours have passed from the time on this message.

2. Support Ruby Quiz by submitting ideas as often as you can:

http://www.rubyquiz.com/

3. Enjoy!

Suggestion: A [QUIZ] in the subject of emails about the problem helps everyone
on Ruby Talk follow the discussion. Please reply to the original quiz message,
if you can.

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- =-=-=

by John Miller

This week's task is to implement the Rope data structure as a Ruby class. This
topic comes out of the ICFP programming competition
(http://www.icfpcontest.com/\) which had competitors manipulating a 7.5 million
character string this year.

        What is a Rope:

You may not realize it, but for many changes the content to a String, Ruby
creates a new copy of the original with the modifications applied. For small
strings that are created once and read often this is actually a very efficient
way to do thing, but what happens when the string starts to get long, and is
undergoing a lot of changes? First, the program will spend more and more of its
processing cycles just copying bits around. Second, the garbage collector will
be called more and more often to pick up the little stringy scraps you've left
all over the memory.

Ropes (the name is a pun for a heavy duty string) are tree structures where a
node represents the concatenation of its left branch with its right, and leaves
are flat strings. (This is a little sloppy. A rope may contain shared subtrees,
and is thus really a directed acyclic graph, where the out-edges of each vertex
are ordered. We will continue to be sloppy.) E.g. To prepend Text A to Text B,
one creates a Node (call it N1) with A as its left branch and B as its right.
To further append Text C create a new Node N2 with its left branch pointing to
N1 and its right to C. Easy, right? To find out more see Boehm, Atkinson and
Plass "Ropes: an Alternative to Strings" at:

       http://rubyurl.com/2FRbO

The task comes in three parts, each increasing in difficulty:

Part one:

Create a Rope class that can do the following:

        a. 'append' or 'prepend' a String or another Rope
           (alias the << operator to the append function)
        b. Return the fully concatenated text with 'to_s'
        c. define 'slice' to call to_s.slice
        d. provide a 'length' method

Part two:

Add the following:

        a. Add the ability to 'index' a single character given a 0-based offset
           from the beginning of the string.
        b. Add the ability to 'shift' a single character from the front of a Rope.
           (Remove and return the character)
        c. Add your own 'slice' method that returns a String. Implement as many of
           the String method's forms as possible. To run the example code this
           function will only need to understand the slice(offset,length) form.
           Major Bonus for Regex and Substring forms.
        d. "Balance" the tree with a 'normalize' method.
           (see Boehm, Atkinson and Plass 1319 Rebalancing)

Part three: (bonus)

Add the following:

        a. Change the 'slice' method to return a Rope. Ideally this method should
           do as little string copying as possible. (Doing this will well
           dramatically increase the speed of the example code)
        b. Allow 'shift' to optionally accept an integer number of characters to
           remove and return.
        c. modify the '<<' operator so that can efficiently append a few
           characters at a time. (see Boehm, Atkinson and Plass 1318 para. 4)
        d. *Major Bonus* Add the ability to append and prepend IO classes in a
           lazy fashion. (see Boehm, Atkinson and Plass 1318 para. 2)

The following code may help you explore how efficient your code is and show
where Ropes are useful. `ruby -r /path/to/your/rope/class this_script.rb Rope`
will run the test with your code. Run the script without arguments to see how
well String does. Also play around with the SIZE and CHUNKS constants to get a
feel for how they affect performance.

        require 'benchmark'

        #This code make a String/Rope of CHUNCKS chunks of text
        #each chunck is SIZE bytes long. Each chunck starts with
        #an 8 byte number. Initially the chuncks are shuffled the
        #qsort method sorts them into ascending order.
        #
        #pass the name of the class to use as a parameter
        #ruby -r rope.rb this_file Rope

        puts 'preparing data...'
        TextClass = Object.const_get(ARGV.shift || :String)

        def qsort(text)
          return TextClass.new if text.length == 0
          pivot = text.slice(0,8).to_s.to_i
          less = TextClass.new
          more = TextClass.new
          offset = 8+SIZE
          while (offset < text.length)
            i = text.slice(offset,8).to_s.to_i
            (i < pivot ? less : more) << text.slice(offset,8+SIZE)
            offset = offset + 8+SIZE
          end
          print "*"
          return qsort(less) << text.slice(0,8+SIZE) << qsort(more)
        end

        SIZE = 512 * 1024
        CHUNCKS = 128
        CHARS = %w[R O P E]
        data = TextClass.new
        bulk_string =
          TextClass.new(Array.new(SIZE) { CHARS[rand(4)] }.join)
        puts 'Building Text...'
        build = Benchmark.measure do
          (0..CHUNCKS).sort_by { rand }.each do |n|
            data<< sprintf("%08i",n) << bulk_string
          end
          data.normalize if data.respond_to? :normalize
        end
        GC.start
        sort = Benchmark.measure do
          puts "Sorting Text..."
          qsort(data)
          puts"\nEND"
        end

        puts "Build: #{build}Sort: #{sort}"

My implementation is attached along with a modified test. I made a
few changes to what was asked, but this also gave more flexibility.
Here is what I did:

* Just as the implementation in the paper referenced in this quiz, my
implementation gives immutable ropes. #<< is non-destructive, so I
had to change the test to do <<= instead of just <<. Also, I didn't
implement #shift, because it must be destructive. The same
functionality can be achieved with 2 calls to #slice (one could be #at
if you only need to shift one element). There are very good reasons
to make ropes immutable. I'd imagine almost every other
implementation with modifiable ropes will fail when someone starts
modifying a sub-rope that is shared. You'd really need a good COW
(copy-on-write) scheme to allow both transparent sharing and
modification. I didn't see that it was worth the
effort/complexity/risk. I chose the simple functional programming
approach (immutable objects).

* I chose to automatically balance the ropes during concatenation (no
normalize). I used the same tree rotations that are used with AVL
trees. Another option could be to treat these as red-black trees
which might save on some rotations. One reason I automatically
balanced is that it simplifies the interface. The user of the API
doesn't have to worry about when to normalize. A second reason is
that every other rope operation is O(log(n)), so there probably isn't
much benefit in making only concatenation O(1).

* I don't allow ropes to directly use Strings. Instead, a StringRope
is used as a proxy for a String. To use a String directly, I would
have had to check the class, use respond_to, modify String to look
like a rope, etc. Instead, I just went with the pure duck-typing
approach and made multiple Rope-like classes that use different types
of data. My basis for these is the class ArrayRope. There is no
reason why a rope data-structure can't be used with any sequence of
objects instead of just characters. ArrayRope takes an Array-like
object. An rope built out of these is to Array as a conventional rope
is to String. I also added an IORope class to lazily work with files.
Using IORope you could make a text editor that didn't have to read
the whole file in at once. There is no reason you can't mix and match
any of these leaf rope classes (depth 0) within a rope tree.

* #each iterates through elements (i.e. characters) in the rope. It
annoys me that String#each (and IO#each for that matter) iterates over
lines - it should be characters (not necessarily bytes). All of the
Enumerable methods are accessible too.

* I used #at instead of #index because #index is used for searching in
String/Array. Array#at is an exact match.

The main thing I didn't do was implement any regex stuff. I don't see
how this is doable since all of the regex methods are completely tied
to String (not duck-typed). You'd have to convert the whole rope to
string to do anything (which defeats the purpose of the rope).

quiz137.rb (7.49 KB)

···

On 8/31/07, Ruby Quiz <james@grayproductions.net> wrote:

This week's task is to implement the Rope data structure as a Ruby class.

I happened to have implemented ropes in OCaml recently, so I generated a Ruby
extension using rocaml to see how well it would perform.

Without further ado, here are the results I'm getting for SIZE = 512 * 1024,
CHUNKS = 512:

$ time ruby -r himadri_choudhury.rb bm.rb Rope
Build: 0.130000 0.000000 0.130000 ( 0.129476)
Sort: 10.340000 0.050000 10.390000 ( 10.648223)

$ time ruby -rocaml_rope bm.rb OCaml::Rope
Build: 0.020000 0.000000 0.020000 ( 0.018946)
Sort: 0.100000 0.000000 0.100000 ( 0.108499)

$ ruby eric_mahurin.rb StringRope
[...]
Build: 0.060000 0.000000 0.060000 ( 0.057299)
Sort: 0.870000 0.000000 0.870000 ( 0.896493)

For SIZE = 1024, CHUNKS = 16384:

$ ruby eric_mahurin.rb StringRope
[...]
Build: 3.470000 0.040000 3.510000 ( 3.588875)
Sort: 89.110000 0.700000 89.810000 ( 92.179962)

$ time ruby -rocaml_rope bm.rb OCaml::Rope
[...]
Build: 0.360000 0.000000 0.360000 ( 0.378352)
Sort: 3.940000 0.040000 3.980000 ( 4.079140)

At that point the pure Ruby rope is taking over 6 times more memory than
the OCaml one. I ascribe this to iv_tbls being very heavy and to memory
fragmentation.

I benchmarked Himadri's implementation first and was surprised by the
exceedingly large speed difference --- I expected one, not two orders of
magnitude for this code, as there's enough Ruby code in common in qsort to
mask the speed gains in the rope operations. However, Eric's solution proved
that it was just due to a slow Ruby implementation.

Here's the interface definition (extconf.rb):

EXT_NAME = "ocaml_rope"
OCAML_PACKAGES = %w
CAML_LIBS = %w
CAML_OBJS = %w
CAML_FLAGS = ""
CAML_INCLUDES =

require 'rocaml'

Interface.generate("ocaml_rope") do |iface|
  def_class("Rope", :under => "OCaml") do |c|
    rope = c.abstract_type

    fun "empty", UNIT => rope, :as => "new_empty"
    fun "of_string", STRING => rope, :as => "new_from_string"

    method "sub", [rope, INT, INT] => rope, :as => "slice"
    method "concat", [rope, rope] => rope
    method "length", rope => INT
    method "get", [rope, INT] => INT
    method "to_string", rope => STRING, :as => "to_s"
  end
end

require 'rocaml_extconf'

As you can see, OCaml::Rope is purely functional, and the interface differs a
bit from that expected by bm.rb (a modified version that works with immutable
ropes is attached), so I adapted it with the following ocaml_rope.rb, which
also loads the extension:

module OCaml # Rope will be placed in this module
end

require "ocaml_rope.so"

module OCaml
  class Rope
    def self.new(str = "")
      case str
      when String; new_from_string str
      when Rope; str
      when ""; new_empty
      else new_from_string(str.to_str) rescue new_from_string(str.to_s)
      end
    end

    def prepend(rope)
      rope.append(self)
    end

    alias_method :append, :concat
    alias_method :<<, :append
  end
end

The OCaml code is attached, in case anybody wants to look at it.
Incidentally, it weighs a bit under 220 lines, which is also the amount taken
by Himadri's and Eric's solutions. Unlike them, rope.ml features O(1)
concatenation for small elements; this accounts for a large part of the code
and the complexity of some patterns. O(1) concatenation doesn't really affect
performance in the use case exerted by bm.rb anyway.

bm.rb (1.37 KB)

rope.ml (6.46 KB)

···

On Fri, Aug 31, 2007 at 10:19:54PM +0900, Ruby Quiz wrote:

by John Miller

This week's task is to implement the Rope data structure as a Ruby class. This
topic comes out of the ICFP programming competition
(http://www.icfpcontest.com/\) which had competitors manipulating a 7.5 million
character string this year.

--
Mauricio Fernandez - http://eigenclass.org

My solution deviates slightly from the problem specification.
The most important difference is that instead of implementing
a binary tree to store the strings, they are all stored in an
array instead.

The class itself is quite long, since I wanted to implement
many of the methods of the built-in String class. Some of the
methods will require significant work to actually implement.
Most notably the Regexp-related methods, since there is no
way to instruct the Regexp matcher to ask for more characters
once it has reached the end of a string.

Actually, all String methods are implemented, by implementing
method missing and delegating to the composed string if the
string class can handle the method. After delegating to the
string class, we convert any String result to a new rope and
we also make sure to replace our content by the string we
delegated to, to make sure that destructive methods works as
expected.

In fact, we replace the content of our rope as soon as to_s
is called. since the reason for lazy concatenation is to
avoid the concatenation cost, we can just as well keep the
concatenated string when we already had to pay the price.

The benchmark results are:

# String
Build: 0.170000 0.150000 0.320000 ( 0.324341)
Sort: 3.470000 3.630000 7.100000 ( 7.126741)

# ArrayRope
Build: 0.010000 0.010000 0.020000 ( 0.009744)
Sort: 0.130000 0.000000 0.130000 ( 0.138330)

# ArrayRope (with dup)
Build: 0.240000 0.160000 0.400000 ( 0.402201)
Sort: 0.160000 0.000000 0.160000 ( 0.163022)

For the unprotected case, sorting was ~52 times faster,
and building was ~33 times faster.

However, since the string class is not immutable, there is a
slight problem with this approach. The strings could added
to the rope could be modified "under the hood". We can easily
protect against that by making copies of the strings when we
add them to the rope. Since the built-in String shares the
actual data between the two instances (until they change),
this is not so much of a memory issue as one could expect.

By adding dup (in initialize/append/prepend) we end up with
the following times, which trades of some of our speed/memory
for a bit of safety. (This is actually the submitted solution)

Compared with the string approach, building is now (for obvious
reasons) slower than the String, but only about 25%.
Sorting is still much faster than the String case, namely ~44
times as fast.

!g

array_rope.rb (5.45 KB)

···

On 8/31/07, Ruby Quiz <james@grayproductions.net> wrote:

by John Miller

This week's task is to implement the Rope data structure as a Ruby class. This
topic comes out of the ICFP programming competition
(http://www.icfpcontest.com/\) which had competitors manipulating a 7.5 million
character string this year.

I modified my implementation a bit more and provided results along
with the other ruby implementations (sorry Mauricio) submitted. The
benchmark test I used is attached. It can run the original build/sort
that assumes mutable ropes and a build/sort that can also be used with
immutable ropes (in addition to mutable ropes). These tests assume
that << can only take another rope. I included some testing to ensure
the results are correct. I also used the linux /proc/$$/status to get
the memory.

Mahurin::StringRope is almost the same as my previous submission. The
main change was handling a boundary case better so I don't
unecessarily concat an empty rope (a < should have been a <=) - this
almost doubled the performance.

I added the class Mahurin::MutableStringRope which is a wrapper around
an immutable rope. I just reassign the instance variable (@rope) to
make changes. I implemented a bunch of String/Array mutable methods
in this class. This wrapper class hurt performance much more than I
expected (double the run-time).

I also tried out not auto-balancing and using an explicit normalize
(Mahurin::DenormalStringRope). This gave much faster build time as
expected, but the sort time slowed down just as much. I guess for
this random data set (I use a fixed seed), qsort doesn't keep the tree
balanced (pivot doesn't necessarily partition equally). The larger
depth for the non-auto-balanced rope hurts the slice time. I think
biting the bullet for auto-balancing is the better way to go.

I added a subclass for handling flattening concatenations of short
strings (Mahurin::ShortStringRope) just to be complete. It isn't
useful in this benchmark, but also doesn't hurt much (within the 0.01
second error margin).

CPU(user+sys,sec) mem(peak,MB)
------------------- ------------
build sort total build sort class
----- ---- ----- ----- ---- -----
  0.10 1.70 1.80 287 1327 String
  0.01 0.27 0.28 22 153 Porth::Rope
  0.02 0.83 0.85 22 34 Choudhury::Rope *
  0.02 0.06 0.08 22 29 Mahurin::StringRope +
  0.00 0.08 0.08 22 30 Mahurin::DenormalStringRope +
  0.02 0.14 0.16 22 29 Mahurin::MutableStringRope
  0.07 0.10 0.17 151 151 Munkby::ArrayRope
  0.02 0.73 0.75 22 655 Kalenkovich::Rope *

CPU : minimum from 40 iterations
mem : peak over the 40 iterations

* : self-checking failed
+ : immutable benchmark needed

test2.rb (3.45 KB)

mahurin.rb (8.79 KB)

···

On 8/31/07, Ruby Quiz <james@grayproductions.net> wrote:

This week's task is to implement the Rope data structure as a Ruby class.

Fixes for previously posted solution:
- empty node guard in slice
- normalize returning self for Eric's test

------------------------------------------------------------------------------

class NilClass
  def length; 0; end
end

class String
  def shift
    return nil if empty?
    res=self[0]
    self[0]=""
    res
  end
end

class Rope
  attr_reader :length, :left, :right

  def initialize(left=nil,right=nil)
    @left=left if left
    @right=right if right
    @length=left.length+right.length
  end

  def append(what)
    len=what.length
    if (len>0)
      @left=self.dup if @right.length>0
      @right=what
      @length+=len
    end
    self
  end

  alias << append

  def prepend(what)
    len=what.length
    if (len>0)
      @right=self.dup if @left.length>0
      @left=what
      @length+=len
    end
    self
  end

  def to_s
    @left.to_s + @right.to_s
  end

  def (i)
    return i.match(self.to_s)[0] if i.kind_of? Regexp
    if i.kind_of? Range
      pos,last=i.first,i.last
      pos = @length+pos if pos<0
      last = @length+last if last<0
      return nil if pos<0 || last<0
      return slice(pos,last-pos+1)
    end
    i = @length+i if i<0
    return nil if i<0 || i>@length-1
    llen = @left.length
    i<llen ? @left[i] : @right[i-llen]
  end

  def =(i,val)
    #fixnum only
    i = @length+i if i<0
    ""[i]=0 if i<0 || i> @length-1
    @length+=val.length-1
    llen = @left.length
    i<llen ? @left[i]=val : @right[i-llen]=val
  end

  def slice(pos,len)
    return pos.match(self.to_s)[len] if pos.kind_of? Regexp
    pos = @length+pos if pos<0
    return nil if pos<0 || len<0 || pos>@length-1
    llen = @left.length
    return @left.slice(pos,len) if pos+len<=llen || ! @right
    return @right.slice(pos-llen, len) if pos>=llen
    Rope.new(@left.slice(pos,llen-pos),@right.slice(0,len+pos-llen))
  end

  def shift
    return nil if @length==0
    @length-=1
    res = @left.length>0 ? @left.shift : @right.shift
    @left=nil if @left.length==0
    @right=nil if @right.length==0
    res
  end

  def normalize
    r=Rebalancer.new(@length)
    self.traverse { |str| r.append(str) }
    @left, @right=r.get_ropes
    self
  end

  def traverse(&blck)
    @left.kind_of?(String) ? yield( @left) : @left.traverse(&blck) if @left
    @right.kind_of?(String) ? yield( @right) : @right.traverse(&blck) if
@right
  end

end

class Rebalancer
  def initialize len
    @limits=[1,2]
    @slots=
    n=2
    @limits<< n = @limits[-2] + @limits[-1] while n<len
  end

  def append str
    @slots[0] = @slots[0] ? Rope.new( @slots[0],str) : str
    i=0
    while @slots[i].length>@limits[i]
      @slots[i+1] = @slots[i+1] ? Rope.new( @slots[i+1],@slots[i]) :
@slots[i]
      @slots[i] = nil
      i+=1
    end
  end

  def get_ropes
    @slots.compact!
    (@slots.length-1).times { |i|
      @slots[i+1]=@slots[i+1] ? Rope.new(@slots[i+1],@slots[i]) : @slots[i]
      @slots[i]=nil
      i+=1
    }
    [@slots[-1].left,@slots[-1].right]
  end
end

Awesome quiz!

Sorry, I have school, but am veeeeeery close to getting this done with! Need to debug my slice method.

If you all want to help me fix it **hint hint**, make it faster, send hate mail, etc. the code is here:
http://pastie.caboo.se/94560

BTW, some of my code (just little lines and ideas) were..... borrowed..... from Gustav Munkby. Sorry!

I swear it's coming in today!
Ari
--------------------------------------------|
If you're not living on the edge,
then you're just wasting space.

I've fixed this on the web site. Thanks for pointing it out.

James Edward Gray II

···

On Sep 1, 2007, at 12:20 AM, John Miller wrote:

The first line of the quiz should read:
"You may not realize it, but for many changes to the content of a
String,"

Hi All -

I posted my solution in the 'rope.rb' file here:
http://pastie.caboo.se/93242
I slightly modified the nice benchmark tool because normalize doesn't work
on the rope itself for me.

I ran a few iterations with the benchmark with
SIZE = 512 * 1024
CHUNKS = 256.

The following is pretty representative of what I observed:

1) Normal strings:
Build: 0.733000 0.406000 1.139000 ( 1.193000)
Sort: 7.800000 7.238000 15.038000 ( 15.924000)

2) Ropes
Build: 0.047000 0.016000 0.063000 ( 0.060000)
Sort: 10.686000 0.016000 10.702000 ( 10.902000)

3) Ropes with the normalization function
Build: 0.156000 0.000000 0.156000 ( 0.156000)
Sort: 5.741000 0.047000 5.788000 ( 6.212000)

Regards,
Himadri

I've modified my Rope so it runs with the benchmark (and fixed some
bugs).

http://pastie.caboo.se/93256

with the included benchmark:

badcarl@navi> ruby -r lib/rope.rb benchmark.rb String
Build: 0.150000 0.080000 0.230000 ( 0.234209)
Sort: 1.700000 1.850000 3.550000 ( 3.613295)

badcarl@navi> ruby -r lib/rope.rb benchmark.rb Rope
Build: 0.000000 0.000000 0.000000 ( 0.009324)
Sort: 0.280000 0.080000 0.360000 ( 0.372736)

I'm getting around 10x speedup on sorting.

···

On Sep 2, 7:26 am, Carl Porth <badc...@gmail.com> wrote:

I've done parts one and two so far. I'll try to add more in the next
few days.

For simplicity and speed, append and prepend don't modify the
receiver.

If anyone has any questions about my code, I'll be happy to answer.

Carl

require "rubygems"
require "facets"
require "kernel/with"
require "symbol/to_proc"

class String
  def shift
    return nil if empty?
    returning self[0].chr do
      self[0] = ""
    end
  end
end

class Rope
  attr_reader :left, :right, :length

  def Rope.new(*args)
    if args.size <= 2
      super
    else # build balanced tree
      mid = args.size / 2
      args[mid,2] = Rope.new(*args[mid,2])
      Rope.new(*args)
    end
  end

  def initialize(left="",right="")
    @left, @right = left, right
    @length = @left.length + @right.length
  end

  def replace(rope)
    initialize(rope.left,rope.right)
    self
  end

  # clean out empty strings and rebuild tree
  def normalize
    Rope.new(*flat_strings.reject(&:empty?))
  end

  def to_s
    flat_strings.join('')
  end

  def append(str)
    Rope.new(self,str)
  end
  alias_method :<<, :append

  def prepend(str)
    Rope.new(str,self)
  end

  def slice(start,length=@length-start)
    if start > left.length # right
      right.slice(start-left.length,length-left.length)
    elsif left.length < length # left and right
      left.slice(start,left.length) +
      right.slice(start-left.length,length-left.length)
    else # left
      left.slice(start,length)
    end
  end

  def shift
    if letter = left.shift || right.shift
      @length -= 1
      letter
    else
      nil
    end
  end

  def index(letter,start=0)
    if start > left.length # right
      left.length + right.index(letter,start - left.length)
    else # left
      left.index(letter,start) || left.length + right.index(letter)
    end
  rescue
    nil
  end

  # TODO implement rest of methods
  def method_missing(method, *args, &block)
    result = to_s.send(method, *args, &block)
    if result.is_a?(String)
      if method.to_s =~ /!$/
        replace(Rope.new(result))
      else
        Rope.new(result)
      end
    else
      result
    end
  end

protected

  # traverse the tree
  def traverse(&block)
    @left.is_a?(Rope) ? @left.traverse(&block) : block.call(@left)
    @right.is_a?(Rope) ? @right.traverse(&block) : block.call(@right)
  end

  # collect all the flat strings
  def flat_strings
    returning do |ary|
      traverse { |str| ary << str }
    end
  end

end

On Aug 31, 6:19 am, Ruby Quiz <ja...@grayproductions.net> wrote:

> The three rules of Ruby Quiz:

> 1. Please do not post any solutions or spoiler discussion for this quiz until
> 48 hours have passed from the time on this message.

> 2. Support Ruby Quiz by submitting ideas as often as you can:

>http://www.rubyquiz.com/

> 3. Enjoy!

> Suggestion: A [QUIZ] in the subject of emails about the problem helps everyone
> on Ruby Talk follow the discussion. Please reply to the original quiz message,
> if you can.

> -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- =-=-=

> by John Miller

> This week's task is to implement the Rope data structure as a Ruby class. This
> topic comes out of the ICFP programming competition
> (http://www.icfpcontest.com/\) which had competitors manipulating a 7.5 million
> character string this year.

> What is a Rope:

> You may not realize it, but for many changes the content to a String, Ruby
> creates a new copy of the original with the modifications applied. For small
> strings that are created once and read often this is actually a very efficient
> way to do thing, but what happens when the string starts to get long, and is
> undergoing a lot of changes? First, the program will spend more and more of its
> processing cycles just copying bits around. Second, the garbage collector will
> be called more and more often to pick up the little stringy scraps you've left
> all over the memory.

> Ropes (the name is a pun for a heavy duty string) are tree structures where a
> node represents the concatenation of its left branch with its right, and leaves
> are flat strings. (This is a little sloppy. A rope may contain shared subtrees,
> and is thus really a directed acyclic graph, where the out-edges of each vertex
> are ordered. We will continue to be sloppy.) E.g. To prepend Text A to Text B,
> one creates a Node (call it N1) with A as its left branch and B as its right.
> To further append Text C create a new Node N2 with its left branch pointing to
> N1 and its right to C. Easy, right? To find out more see Boehm, Atkinson and
> Plass "Ropes: an Alternative to Strings" at:

> http://rubyurl.com/2FRbO

> The task comes in three parts, each increasing in difficulty:

> Part one:

> Create a Rope class that can do the following:

> a. 'append' or 'prepend' a String or another Rope
> (alias the << operator to the append function)
> b. Return the fully concatenated text with 'to_s'
> c. define 'slice' to call to_s.slice
> d. provide a 'length' method

> Part two:

> Add the following:

> a. Add the ability to 'index' a single character given a 0-based offset
> from the beginning of the string.
> b. Add the ability to 'shift' a single character from the front of a Rope.
> (Remove and return the character)
> c. Add your own 'slice' method that returns a String. Implement as many of
> the String method's forms as possible. To run the example code this
> function will only need to understand the slice(offset,length) form.
> Major Bonus for Regex and Substring forms.
> d. "Balance" the tree with a 'normalize' method.
> (see Boehm, Atkinson and Plass 1319 Rebalancing)

> Part three: (bonus)

> Add the following:

> a. Change the 'slice' method to return a Rope. Ideally this method should
> do as little string copying as possible. (Doing this will well
> dramatically increase the speed of the example code)
> b. Allow 'shift' to optionally accept an integer number of characters to
> remove and return.
> c. modify the '<<' operator so that can efficiently append a few
> characters at a time. (see Boehm, Atkinson and Plass 1318 para. 4)
> d. *Major Bonus* Add the ability to append and prepend IO classes in a
> lazy fashion. (see Boehm, Atkinson and Plass 1318 para. 2)

> The following code may help you explore how efficient your code is and show
> where Ropes are useful. `ruby -r /path/to/your/rope/class this_script.rb Rope`
> will run the test with your code. Run the script without arguments to see how
> well String does. Also play around with the SIZE and CHUNKS constants to get a
> feel for how they affect performance.

> require 'benchmark'

> #This code make a String/Rope of CHUNCKS chunks of text
> #each chunck is SIZE bytes long. Each chunck starts with
> #an 8 byte number. Initially the chuncks are shuffled the
> #qsort method sorts them into ascending order.
> #
> #pass the name of the class to use as a parameter
> #ruby -r rope.rb this_file Rope

> puts 'preparing data...'
> TextClass = Object.const_get(ARGV.shift || :String)

> def qsort(text)
> return TextClass.new if text.length == 0
> pivot = text.slice(0,8).to_s.to_i
> less = TextClass.new
> more = TextClass.new
> offset = 8+SIZE
> while (offset < text.length)
> i = text.slice(offset,8).to_s.to_i
> (i < pivot ? less : more) << text.slice(offset,8+SIZE)
> offset = offset + 8+SIZE
> end
> print "*"
> return qsort(less) << text.slice(0,8+SIZE) << qsort(more)
> end

> SIZE = 512 * 1024
> CHUNCKS = 128
> CHARS = %w[R O P E]
> data = TextClass.new
> bulk_string =
> TextClass.new(Array.new(SIZE) { CHARS[rand(4)] }.join)
> puts 'Building Text...'
> build = Benchmark.measure do
> (0..CHUNCKS).sort_by { rand }.each do |n|
> data<< sprintf("%08i",n) << bulk_string
> end
> data.normalize if data.respond_to? :normalize
> end
> GC.start
> sort = Benchmark.measure do
> puts "Sorting Text..."
> qsort(data)
> puts"\nEND"
> end

> puts "Build: #{build}Sort: #{sort}"

Forgot about the results. Here's what I found on my machine:

String
Build: 0.120000 0.080000 0.200000 ( 0.206264)
Sort: 1.680000 0.420000 2.100000 ( 2.112934)
VmPeak: 1192112 kB
VmSize: 493708 kB

StringRope
Build: 0.010000 0.000000 0.010000 ( 0.014414)
Sort: 0.150000 0.020000 0.170000 ( 0.176734)
VmPeak: 38940 kB
VmSize: 37920 kB

so, 20X faster for build and 12.4X faster for sort. Memory looks to
be 10-30X smaller. I ran the build and sort 8 times during the
benchmark and took the min. The memory kept increasing for the String
runs for each of these 8 iterations which doesn't make sense.

···

On 9/3/07, Eric Mahurin <eric.mahurin@gmail.com> wrote:

On 8/31/07, Ruby Quiz <james@grayproductions.net> wrote:
> This week's task is to implement the Rope data structure as a Ruby class.

My implementation is attached along with a modified test. I made a
few changes to what was asked, but this also gave more flexibility.

I happened to have implemented ropes in OCaml recently, so I generated a Ruby
extension using rocaml to see how well it would perform.

[...]

For SIZE = 1024, CHUNKS = 16384:

$ ruby eric_mahurin.rb StringRope
[...]
Build: 3.470000 0.040000 3.510000 ( 3.588875)
Sort: 89.110000 0.700000 89.810000 ( 92.179962)

$ time ruby -rocaml_rope bm.rb OCaml::Rope
[...]
Build: 0.360000 0.000000 0.360000 ( 0.378352)
Sort: 3.940000 0.040000 3.980000 ( 4.079140)

Also for laughs, playing with the GC parameters and with a qsort implemented
in OCaml:

$ OCAMLRUNPARAM=s=512k ruby -rocaml_rope bm.rb OCaml::Rope
[...]
Build: 0.290000 0.000000 0.290000 ( 0.290908)
Sort: 3.410000 0.040000 3.450000 ( 3.547792)
Sort': 0.830000 0.000000 0.830000 ( 0.845180)

Yielding the expected >100x gain over Ruby.

The interface part:

    method "qsort", [rope, INT] => rope
    method "qsort2", [rope, INT] => rope

I implemented the qsort function both in functional and imperative style, for
the sake of those who prefer something that reads similarly to the Ruby
version. The two variants are equally fast.

let (<<) = concat (* OCaml allows you to define new operators *)
let to_i = int_of_string
let to_s = to_string

let rec qsort' size = function
    Empty -> Empty
  > rope ->
      let pivot = to_i (to_s (sub 0 8 rope)) in
      let len = 8 + size in
      let less = ref Empty in
      let more = ref Empty in
      let off = ref len in
        while !off < length rope do
          let slice = sub !off len rope in
            if to_i (to_s (sub !off 8 rope)) < pivot then
        less := !less << slice
            else
        more := !more << slice;
            off := !off + len
        done;
        qsort' size !less << sub 0 len rope << qsort' size !more

let rec qsort size = function
    Empty -> Empty
  > rope ->
      let rec loop r pivot off len max less more =
        if off < max then begin
          if to_i (to_s (sub off 8 r)) < pivot then
            loop r pivot (off+len) len max (less << (sub off len r)) more
          else
            loop r pivot (off+len) len max less (more << (sub off len r))
        end else (less, more) in

      let pivot = to_i (to_s (sub 0 8 rope)) in
      let len = 8 + size in
      let less, more = loop rope pivot len len (length rope) Empty Empty in
        qsort size less << sub 0 len rope << qsort size more

···

On Mon, Sep 03, 2007 at 05:52:31PM +0900, Mauricio Fernandez wrote:

--
Mauricio Fernandez - http://eigenclass.org

Ok, So I'm still trying to figure out what stores the characters in a Rope. A string or an array?

Judging from the fact that you have ArrayRope, I'm thinking it might be an Array.

···

On Sep 3, 2007, at 5:42 AM, Gustav Munkby wrote:

# ArrayRope
Build: 0.010000 0.010000 0.020000 ( 0.009744)
Sort: 0.130000 0.000000 0.130000 ( 0.138330)

# ArrayRope (with dup)
Build: 0.240000 0.160000 0.400000 ( 0.402201)
Sort: 0.160000 0.000000 0.160000 ( 0.163022)

Help!
Ari
--------------------------------------------|
If you're not living on the edge,
then you're just wasting space.

Here is my solution. As a side note, it was probably not the best idea to
stick to the benchmarking code in the quiz, as it forced not the best design
decisions (I'd rather not have default constructor for Rope, and for sure -
no normalizing 'in place'). But - what's done is done.
For slice variants I've decided to have two argument ones in #slice, and one
arg - in #[]

Results:
String

Build: 0.188000 0.032000 0.220000 ( 0.219000)
Sort: 1.578000 0.843000 2.421000 ( 2.422000)

Rope

Build: 0.031000 0.000000 0.031000 ( 0.031000)
Sort: 0.203000 0.000000 0.203000 ( 0.234000)

Code:

···

-------------------------------------------------------------------------------------------------------------------------
class NilClass
  def length; 0; end
end

class String
  def shift
    return nil if empty?
    res=self[0]
    self[0]=""
    res
  end
end

class Rope
  attr_reader :length, :left, :right

  def initialize(left=nil,right=nil)
    @left=left if left
    @right=right if right
    @length=left.length+right.length
  end

  def append(what)
    len=what.length
    if (len>0)
      @left=self.dup if @right.length>0
      @right=what
      @length+=len
    end
    self
  end

  alias << append

  def prepend(what)
    len=what.length
    if (len>0)
      @right=self.dup if @left.length>0
      @left=what
      @length+=len
    end
    self
  end

  def to_s
    @left.to_s+@right.to_s
  end

  def [](i)
    return i.match(self.to_s)[0] if i.kind_of? Regexp
    if i.kind_of? Range
      pos,last=i.first,i.last
      pos=@length+pos if pos<0
      last=@length+last if last<0
      return nil if pos<0 || last<0
      return slice(pos,last-pos+1)
    end
    i=@length+i if i<0
    return nil if i<0 || i>@length-1
    llen=@left.length
    i<llen ? @left[i] : @right[i-llen]
  end

  def []=(i,val)
    #fixnum only
    i=@length+i if i<0
    ""[i]=0 if i<0 || i>@length-1
    @length+=val.length-1
    llen=@left.length
    i<llen ? @left[i]=val : @right[i-llen]=val
  end

  def slice(pos,len)
    return pos.match(self.to_s)[len] if pos.kind_of? Regexp
    pos=@length+pos if pos<0
    return nil if pos<0 || len<0 || pos>@length-1
    llen=@left.length
    return @left.slice(pos,len) if pos+len<=llen
    return @right.slice(pos-llen, len) if pos>=llen
    Rope.new(@left.slice(pos,len),@right.slice(0,len+pos-llen))
  end

  def shift
    return nil if @length==0
    @length-=1
    res=@left.length>0 ? @left.shift : @right.shift
    @left=nil if @left.length==0
    @right=nil if @right.length==0
    res
  end

  def normalize
    r=Rebalancer.new(@length)
    self.traverse { |str| r.append(str) }
    @left,@right=r.get_ropes
  end

  def traverse(&blck)
    @left.kind_of?(String) ? yield(@left) : @left.traverse(&blck) if @left
    @right.kind_of?(String) ? yield(@right) : @right.traverse(&blck) if
@right
  end

end

class Rebalancer
  def initialize len
    @limits=[1,2]
    @slots=[]
    n=2
    @limits<<n=@limits[-2]+@limits[-1] while n<len
  end

  def append str
    @slots[0]=@slots[0] ? Rope.new(@slots[0],str) : str
    i=0
    while @slots[i].length>@limits[i]
      @slots[i+1]=@slots[i+1] ? Rope.new(@slots[i+1],@slots[i]) : @slots[i]
      @slots[i]=nil
      i+=1
    end
  end

  def get_ropes
    @slots.compact!
    (@slots.length-1).times { |i|
      @slots[i+1]=@slots[i+1] ? Rope.new(@slots[i+1],@slots[i]) : @slots[i]
      @slots[i]=nil
      i+=1
    }
    [@slots[-1].left,@slots[-1].right]
  end
end

enough of this nonsense already! where is the gem! :wink:

a @ http://drawohara.com/

···

On Sep 3, 2007, at 7:48 PM, Eric Mahurin wrote:

CPU(user+sys,sec) mem(peak,MB)
------------------- ------------
build sort total build sort class
----- ---- ----- ----- ---- -----
  0.10 1.70 1.80 287 1327 String
  0.01 0.27 0.28 22 153 Porth::Rope
  0.02 0.83 0.85 22 34 Choudhury::Rope *
  0.02 0.06 0.08 22 29 Mahurin::StringRope +
  0.00 0.08 0.08 22 30 Mahurin::DenormalStringRope +
  0.02 0.14 0.16 22 29 Mahurin::MutableStringRope
  0.07 0.10 0.17 151 151 Munkby::ArrayRope
  0.02 0.73 0.75 22 655 Kalenkovich::Rope *

CPU : minimum from 40 iterations
mem : peak over the 40 iterations

--
we can deny everything, except that we have the possibility of being better. simply reflect on that.
h.h. the 14th dalai lama

Eric, thanks for taking the time to run benchmarks for everyone's
solutions. It clears things up when they're run all on the same
machine.

Carl

···

On Sep 3, 6:48 pm, "Eric Mahurin" <eric.mahu...@gmail.com> wrote:

On 8/31/07, Ruby Quiz <ja...@grayproductions.net> wrote:

> This week's task is to implement the Rope data structure as a Ruby class.

I modified my implementation a bit more and provided results along
with the other ruby implementations (sorry Mauricio) submitted. The
benchmark test I used is attached. It can run the original build/sort
that assumes mutable ropes and a build/sort that can also be used with
immutable ropes (in addition to mutable ropes). These tests assume
that << can only take another rope. I included some testing to ensure
the results are correct. I also used the linux /proc/$$/status to get
the memory.

Mahurin::StringRope is almost the same as my previous submission. The
main change was handling a boundary case better so I don't
unecessarily concat an empty rope (a < should have been a <=) - this
almost doubled the performance.

I added the class Mahurin::MutableStringRope which is a wrapper around
an immutable rope. I just reassign the instance variable (@rope) to
make changes. I implemented a bunch of String/Array mutable methods
in this class. This wrapper class hurt performance much more than I
expected (double the run-time).

I also tried out not auto-balancing and using an explicit normalize
(Mahurin::DenormalStringRope). This gave much faster build time as
expected, but the sort time slowed down just as much. I guess for
this random data set (I use a fixed seed), qsort doesn't keep the tree
balanced (pivot doesn't necessarily partition equally). The larger
depth for the non-auto-balanced rope hurts the slice time. I think
biting the bullet for auto-balancing is the better way to go.

I added a subclass for handling flattening concatenations of short
strings (Mahurin::ShortStringRope) just to be complete. It isn't
useful in this benchmark, but also doesn't hurt much (within the 0.01
second error margin).

CPU(user+sys,sec) mem(peak,MB)
------------------- ------------
build sort total build sort class
----- ---- ----- ----- ---- -----
  0.10 1.70 1.80 287 1327 String
  0.01 0.27 0.28 22 153 Porth::Rope
  0.02 0.83 0.85 22 34 Choudhury::Rope *
  0.02 0.06 0.08 22 29 Mahurin::StringRope +
  0.00 0.08 0.08 22 30 Mahurin::DenormalStringRope +
  0.02 0.14 0.16 22 29 Mahurin::MutableStringRope
  0.07 0.10 0.17 151 151 Munkby::ArrayRope
  0.02 0.73 0.75 22 655 Kalenkovich::Rope *

CPU : minimum from 40 iterations
mem : peak over the 40 iterations

* : self-checking failed
+ : immutable benchmark needed

test2.rb
4KDownload

mahurin.rb
11KDownload

"Eric Mahurin" wrote in message
news:29256ea00709031847ua14d891k810fe28236d6edd9@mail.gmail.com...

CPU(user+sys,sec) mem(peak,MB)
------------------- ------------
build sort total build sort class
----- ---- ----- ----- ---- -----
0.10 1.70 1.80 287 1327 String
0.01 0.27 0.28 22 153 Porth::Rope
0.02 0.83 0.85 22 34 Choudhury::Rope *
0.02 0.06 0.08 22 29 Mahurin::StringRope +
0.00 0.08 0.08 22 30 Mahurin::DenormalStringRope +
0.02 0.14 0.16 22 29 Mahurin::MutableStringRope
0.07 0.10 0.17 151 151 Munkby::ArrayRope
0.02 0.73 0.75 22 655 Kalenkovich::Rope *

CPU : minimum from 40 iterations
mem : peak over the 40 iterations

* : self-checking failed
+ : immutable benchmark needed

Eric, thanks a lot for your test, it helped me to find a bug in my solution.
The only thing I do not understand - you've changed the game rules on flight
by making "data = data.normalize" instead of "data.normalize". I'm wondering
how could you get sorting time > 0 for my solution (as it played by old
rules and would give you nothing for such assignment)? :slight_smile:

Anyway, I'm posting a fixed solution (for the bug and for your test) in the
next message, and I'd appreciate if you can re-run it

-- EK

CPU(user+sys,sec) mem(peak,MB)
------------------- ------------
build sort total build sort class
----- ---- ----- ----- ---- -----
  0.10 1.70 1.80 287 1327 String
  0.01 0.27 0.28 22 153 Porth::Rope
  0.02 0.83 0.85 22 34 Choudhury::Rope *
  0.02 0.06 0.08 22 29 Mahurin::StringRope +
  0.00 0.08 0.08 22 30 Mahurin::DenormalStringRope +
  0.02 0.14 0.16 22 29 Mahurin::MutableStringRope
  0.07 0.10 0.17 151 151 Munkby::ArrayRope
  0.02 0.73 0.75 22 655 Kalenkovich::Rope *

Thanks for running the tests! I do think it is a shame, however,
that the table does not include the notion of safety that I mentioned
in my submission. If any of these classes should every go into a
gem as ara mentioned, the user of that gem at least ought to be
able to choose the safe version.

Here is a very simple scenario where such a problem is introduced.
One could probably do worse changes to the source strings, but this
was the first example I came up with.

ss = %w[test append]
r1, r2 = ss.map {|x| Mahurin::StringRope.new(x) }
r1 <<= r2
ss.first << "ing"
r1.length # => 10
r1.to_s.length # => 13

I have only tested Mahurin::StringRope, but would like to know for
which other classes I need to be extra careful when adding strings
to the rope. I did find, however, that by removing my "safety dups",
the combined running time (on my machine) is about the same for
my solution as for Mahurin::StringRope.

!g