···
On Sun, 2003-03-23 at 22:15, Gavin Sinclair wrote:
I’d like someone, hopefully including Julian and Jim (Weirich), to
demonstrate how to iterate over twop lists in parallel. Extra marks
for setting a realistic example problem and solving it!
#!/usr/bin/env ruby
= Introduction
[On Monday, March 24, 2003 in ruby-talk, Gavin Sinclair writes]
"I’d like someone, hopefully including Julian and Jim (Weirich), to
demonstrate how to iterate over two lists in parallel. Extra marks
for setting a realistic example problem and solving it!"
How can I refuse an offer like that!
Here’s what’s ahead … We will start with the Same Fringe problem.
Then we will implement the solution three times (once by converting
everything to arrays, once with a hand-written external iterator,
and once with a generator). We will finish up with a discussion of
how the generator implementation works (using continuations) works
in Ruby.
Sound good?
= The Same Fringe Problem
Lets imagine that we have a binary tree filled with sorted data.
This kind of tree is a common data structure used to implement
hash-like tables where the keys are stored in sorted order.
These binary trees have the property that depth-first visit of the
leaf nodes will visit the leaves in sorted order (we are ignoring
interior nodes for this example). However, two trees containing the
same leaf nodes may be structured differently (due to the order in
which nodes are inserted into the tree).
We want to write some code that will determine if two binary trees
have the same fringe (i.e. leaf) nodes.
== Example
Consider the following trees …
tree1 tree2 tree3
o o o
/ \ / \ / \
A o o C o C
/ \ / \ / \
B C A B A X
Trees 1 and 2 have the same fringe leaves, although they are
internally structured differently. Tree 3 does not have the same
fringe as either tree 1 or 2, due to the leaf node X.
= The Code
So lets take a look at the code to handle this problem.
== Class Node
We will use a Node class to represent the interior nodes of our
binary tree (leaf nodes will simply be strings). A node has an
accessor for its left and right children.
class Node
attr_accessor :left, :right
def initialize(left, right)
@left, @right = left, right
end
end
Iterating through a binary tree is pretty easy. You just
recursively visit the left children and then the right children.
Since we are ignoring the interior nodes, we don’t have to worry
about pre-order vs post-order.
We use implement the iteration in term of +each+ and include
+Enumerable+, making our trees respond to the standard Ruby internal
iterator protocol.
class Node
include Enumerable
def each(&block)
handle_child(self.left, &block)
handle_child(self.right, &block)
end
def handle_child(value, &block)
case value
when Node
value.left.each(&block)
value.right.each(&block)
else
yield(value)
end
end
end
== External Iterators
Internal iterators are difficult to use in solving the same-fringe
problem. The reason is that we want to walk both trees, one element
at a time, so we can compare the elements. Because internal
iterators encapsulate the iteration algorithm, it is difficult to
change the algorithm to handle two trees.
Fortunately it is easy to solve the same-fringe problem with
external iterators. In the following code, assume an iterator
provides a method named +get+ that returns the next available item
from the iterator. When the iterator is done, +get+ will return
nil.
== Same Fringe Function Using External Iterators
Here is the same_fringe function written with external iterators.
The function takes two iterators as arguments, one iterator for each
of the two trees being compared. Same_fringe will only return true
if each element from both trees are identical all the way to the end
of the list. Since the iterators just return a sequence of leaf
nodes, we are ignoring the tree structure during the comparison.
def same_fringe(it1, it2)
while v1 = it1.get
v2 = it2.get
return false if v1 != v2
end
return v1 == it2.get
end
Notice that +same_fringe+ doesn’t care what kind of iterator it is
given, as long as the iterator conforms to the +get+ specification
we gave. We will write several iterators and pass them to
+same_fringe+.
== Support Code
Before we write some iterators, here is some support code that we
will use in the demonstration.
show will show the contents of a tree using the given iterator.
show is a good example of using our external iterator.
def show(msg, it)
print msg
while v = it.get
print " ", v
end
puts
end
show_sf will run the +same_fringe+ function with the given
iterators and show the results.
def show_sf(msg, expect, it1, it2)
result = same_fringe(it1, it2)
puts “Same Fringe, #{msg}, should be #{expect}, was #{result}”
fail “Unexected Result!” if expect != result
end
Finally, demonstrate will create some trees and display them.
Demonstrate expects a block that will return an iterator of the
appropriate type.
def demonstrate(msg)
tree1 = Node.new(“A”, Node.new(“B”, “C”))
tree2 = Node.new(Node.new(“A”, “B”), “C”)
tree3 = Node.new(Node.new(“A”, “X”), “C”)
puts “Using #{msg} …”
show("Tree1 = ", yield(tree1))
show("Tree2 = ", yield(tree2))
show("Tree3 = ", yield(tree3))
show_sf(“tree1 vs tree2”, true,
yield(tree1),
yield(tree2))
show_sf(“tree1 vs tree3”, false,
yield(tree1),
yield(tree3))
puts
end
----
== Array Iterator
Our first external iterator will be one that walks through an array.
An array iterator is trivial to implement (as you can see). Also,
it is easy to convert a tree to an array since the Node objects
support the +Enumerable+ protocol. We just need to call +to_a+ and
we have an array of leaf nodes.
Here is the array iterator…
class ArrayIterator
def initialize(collection)
@array = collection.to_a.dup
end
def get
@array.shift
end
end
We use our +demonstrate+ function to show that the array iterater
does do its job.
demonstrate(“Array Iterator”) { |tree| ArrayIterator.new(tree) }
=== Output for Array Iterator
Using Array Iterator …
Tree1 = A B C
Tree2 = A B C
Tree3 = A X C
Same Fringe, tree1 vs tree2, should be true, was true
Same Fringe, tree1 vs tree3, should be false, was false
----
== Tree Iterator
Although trivial to write, the array iterator suffers from a
potential problem. The tree must be converted to an array before
the +same_fringe+ function can even begin to look at leaf nodes. If
the tree is large, then this can be problematic.
Can we create an external iterator that examines the tree "in
place"? Certainly! Here is the code.
class TreeIterator
def initialize(node)
@tree = node
@stack =
end
Get the next leaf node.
def get
if @tree
t = @tree
@tree = nil
left_most(t)
elsif ! @stack.empty?
node = @stack.pop
left_most(node.right)
else
nil
end
end
Find the left-most leaf from +node+.
def left_most(node)
while Node === node
@stack.push(node)
node = node.left
end
node
end
end
And again we use the +demonstrate+ method to show that the
TreeIterator works.
demonstrate(“Tree Iterator”) { |tree| TreeIterator.new(tree) }
=== Output for Tree Iterator
Using Tree Iterator …
Tree1 = A B C
Tree2 = A B C
Tree3 = A X C
Same Fringe, tree1 vs tree2, should be true, was true
Same Fringe, tree1 vs tree3, should be false, was false
----
== Generators
The tree iterator works great, but it was a big pain to code up.
Did you notice the explicit stack that was used to remember the
nodes that had not yet been processed? It works, but the logic is
much more obscure than the internal iterator logic.
Wouldn’t it be nice if it were possible to write an external
iterator reusing the logic from the internal iterator.
Fortunately, it is possible to do exactly that using generators. A
generator is simply a resumable function. After a generator returns
a value, the next time it is called it starts executing immediately
after the last return, remembering the values of all the local
variables in the function.
Python generators use the keyword yield to designate returning a
value and the resumption point in a generator. Since Ruby already
uses yield for blocks, we will use a generate method to return
generated values.
Here is a simple example that generates the squares of the numbers 0
through 3. Notice that the iteration logic goes into a block given
to the generator constructor. The generator object returned from
the constructor can be used as an iterator, just like our Tree and
Array iterators.
$ irb --simple-prompt
>> require ‘generator’
=> true
>> it = Generator.new { |g| 4.times { |i| g.generate(i**2) } }
>> it.get
=> 0
>> it.get
=> 1
>> it.get
=> 4
>> it.get
=> 9
>> it.get
=> nil
>> exit
$
We will look at the guts of the generator in a moment. In the
meantime, lets use a generator to solve the same-fringe problem.
First we load the generator code.
require ‘generator’
Then we define a function that creates our generator for us. Since
we already have an internal iterator that works, we will simply use
that in the iteration logic of our generator.
def tree_generator(tree)
Generator.new do |generator|
tree.each { |value| generator.generate(value) }
end
end
And we show that the generator works.
demonstrate(“Generator”) { |tree| tree_generator(tree) }
=== Output for Tree Iterator
Using Generator …
Tree1 = A B C
Tree2 = A B C
Tree3 = A X C
Same Fringe, tree1 vs tree2, should be true, was true
Same Fringe, tree1 vs tree3, should be false, was false
----
== Writing the Generator Code
So how are generators written in Ruby? … Very Carefully!
No, seriously! We need to use continuations.
=== Continuations
The key to writing generators is using continuations. A
continuation is a callable object (i.e. you can send a :call message
to a continuation) that encodes the return point of a function.
Continuations are created by given the +callcc+ (or "Call with
Current Continuation") method a block. +callcc+ passes its own
continuation (the “current” continuation) to the block as a
parameter. When the continuation is called, that particular
instance of +callcc+ will return immediately with the value passed
in the call argument list.
Here’s a short example …
x = callcc do |continuation|
puts “In CALLCC”
continuation.call(“HI”)
puts “This is never printed”
end
puts “X = #{x}”
Will print …
In CALLCC
X = HI
In this example, callcc looks like a version of setjump/longjump in
C where the continuation plays the role of the jump buffer. There
is one big difference. In C, you must never call longjump outside
the scope of the setjump call. A continuation is allowed to be
called anywhere! Even outside the scope of the callcc method.
We can use this fact to create resumable functions. The following
function will return once (returning a 1). We then resume the
function and it returns a second time, assigning 2 to x.
def resumable
callcc do |continuation|
$resume_point = continuation
return 1
end
return 2
end
x = resumable
puts “Again, X = #{x}”
$resume_point.call if x != 2
Will print …
Again, X = 1
Again, X = 2
Wow! This gives use the tools we need to create resumable functions.
The key is to capture a continuation that will allow us to resume a
function at the point we need.
=== The Generator Class
The key to our generator design is tracking the two different
continuations we need. The @resume_generator continuation will be
called from the main code (via the +get+ function) and will resume
into the generator. The @resume_mainline continuation will be
used by the generator to return to the mainline code.
Here’s the start of our generator class.
class Generator
The initialize function is carefully designed to initialize the
@resume_generator instance variable. We use +callcc+ to capture
the continuation we want, and then return early from +initialize+.
The rest of +initialize+ will be run the first time we resume the
generator.
def initialize
callcc do |continuation|
@resume_generator = continuation
return
end
yield(self)
while true
generate(nil)
end
end
The +get+ method is called from the mainline code to resume the
generator (to eventually return the next value). Our tasks are to
…
* Capture the mainline continuation.
* Resume the generator.
def get
callcc do |continuation|
@resume_mainline = continuation
@resume_generator.call
end
end
Finally we have the +generate+ method. It is called from the
generator to return values to the mainline code. Our tasks are to
…
* Capture the continuation for the generator.
* Resume the mainline code, while returning the next generated
value.
def generate(value)
callcc do |continuation|
@resume_generator = continuation
@resume_mainline.call(value)
end
end
end
And that’s it.
----
= Further Reading
Lecture notes – Applications of continuations,
Daniel Friedman
Continuations made simple and Illustrated,
Denys Duchier
Search for “continuations” on Google for even more references.
“Beware of bugs in the above code; I have only proved it correct,
not tried it.” – Donald Knuth (in a memo to Peter van Emde Boas)