I don't know if it was the metaprogramming that scared people away
this week, or perhaps folks are away on summer vacations. In any case,
I'm going to summarize this week's quiz by looking at the submission
from _Matthias Reitinger_. The solution is, as Matthias indicates,
unexpectedly concise. "I guess that's just the way Ruby works."
Matthias' code implements the `Statistician` module in three parts,
each a class. Here is the first class, `Rule`:

    class Rule
      def initialize(pattern)
        @fields = []
        pattern = Regexp.escape(pattern).gsub(/\\\[(.+?)\\\]/, '(?:\1)?').
          gsub(/<(.+?)>/) { @fields << $1; '(.+?)' }
        @regexp = Regexp.new('^' + pattern + '$')
      end

      def match(line)
        @result = if md = @regexp.match(line)
          Hash[*@fields.zip(md.captures).flatten]
        end
      end

      def result
        @result
      end
    end

`Rule` makes use of regular expressions built up as discussed in the
previous quiz, so I'm not going to discuss that here. I will point
out, though, the initialization of the `@fields` member in the
initializer. Note the last `gsub` call: it uses the block form of
`gsub`.

    gsub(/<(.+?)>/) { @fields << $1; '(.+?)' }

Because the `'(.+?)'` string is the last expression evaluated in the
block, it becomes the replacement text. Before that, however, the block
uses the just-matched expression (`$1`) to record the field name. This
avoids making a second pass over the source string to get those field
names, and is arguably simpler.
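
To see the block form of `gsub` in isolation, here is a minimal sketch. The rule text and the local variable names are made up for illustration, and the `Regexp.escape` step from Matthias' code is left out to keep the focus on the block.

```ruby
# Collect field names while rewriting the pattern, all in one pass.
fields  = []
pattern = 'You hit <name> for <amount> damage'   # hypothetical rule text

rewritten = pattern.gsub(/<(.+?)>/) do
  fields << $1   # side effect: remember the field name just matched
  '(.+?)'        # value of the block: the replacement text
end

p fields      # ["name", "amount"]
p rewritten   # "You hit (.+?) for (.+?) damage"
```

The block runs once per match, so the field names end up in `fields` in the same order as the capture groups in the rewritten pattern.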
The `match` method matches an input line against the regular expression,
returning nil if the input didn't match, or a hash if it did. Field
names (`@fields`) are first paired (`zip`) with the matched values
(`md.captures`), then `flatten`-ed into a single array, and finally
expanded (`*`) and passed to `Hash[]`, which treats alternating items
as keys and values. The end result of `Rule#match`, when the input
matches, is a hash that looks like this:

    { 'amount' => '108', 'name' => 'Tempest Warg' }

That hash is returned, but also stored in the instance variable
`@result` for future reference, accessed by the last method, `result`.
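
The three steps can be traced one at a time. This is only a sketch: the regular expression and the input line below are assumptions chosen to produce the hash shown above, not taken from the quiz data.

```ruby
fields = ['name', 'amount']
regexp = /^You hit (.+?) for (.+?) damage$/          # hypothetical rule
md     = regexp.match('You hit Tempest Warg for 108 damage')

pairs = fields.zip(md.captures)   # [["name", "Tempest Warg"], ["amount", "108"]]
flat  = pairs.flatten             # ["name", "Tempest Warg", "amount", "108"]
hash  = Hash[*flat]               # keys and values taken alternately from flat

puts hash['name']     # Tempest Warg
puts hash['amount']   # 108
```

In more recent Ruby versions, `Hash[pairs]` or `pairs.to_h` would probably do the same job without the `flatten` and splat.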
The next class is `Reportable`:

    class Reportable < OpenStruct
      class << self
        attr_reader :records

        def inherited(klass)
          klass.class_eval do
            @rules, @records = [], []
          end
          super
        end

        def rule(pattern)
          @rules << Rule.new(pattern)
        end

        def match(line)
          if rule = @rules.find { |rule| rule.match(line) }
            @records << self.new(rule.result)
          end
        end
      end
    end

This small class is the extent of the metaprogramming going on in the
solution, and it's not much, though perhaps unfamiliar to some. Let's
get into some of it. We'll ignore the `OpenStruct` inheritance for the
moment, coming back to it later.
Everything inside the `Reportable` class is surrounded by a block that
opens with `class << self`. There is a [good summary on the Ruby Talk
mailing list][1], but its use here can be summed up in two words:
class methods. The `class << self` mechanism is not strictly about
class methods, but in this context it has the same effect.
Alternatively, these methods could have been defined in this manner:

    class Reportable < OpenStruct
      def Reportable.rule(pattern)
        # etc.
      end

      def Reportable.match(line)
        # etc.
      end

      # etc.
    end

In the end, the `class << self` mechanism is cleaner looking, and also
allows for use of `attr_reader` in a natural way.
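
As a small illustration of both spellings side by side, consider the sketch below. `Greeter` and its methods are hypothetical names invented for this example and have nothing to do with the quiz code.

```ruby
class Greeter   # hypothetical class, for illustration only
  @greetings = ['hello']   # instance variable of the Greeter class object

  class << self
    attr_reader :greetings   # defines the class-level reader Greeter.greetings

    def greet(name)          # defines Greeter.greet
      "#{@greetings.first}, #{name}"
    end
  end

  def self.shout(name)       # the alternative spelling; also a class method
    greet(name).upcase
  end
end

puts Greeter.greet('Matz')   # hello, Matz
puts Greeter.shout('Matz')   # HELLO, MATZ
p Greeter.greetings          # ["hello"]
```

Inside `class << self`, `attr_reader` reads an instance variable of the class object itself, which is why it pairs so naturally with per-class data such as `@records`.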
The next interesting bit is the `inherited` method. This is a class
method, here implemented on `Reportable`, that is called whenever
`Reportable` is subclassed (which happens repeatedly in the client
code). It's a convenient hook that allows the other bit of
metaprogramming to happen.

    klass.class_eval do
      @rules, @records = [], []
    end

`klass` is the class derived from `Reportable` (i.e. our client's
classes for future statistical analysis). Here, Matthias initializes
two members, both to empty arrays, in the scope of class `klass`. This
serves to ensure that every class derived from `Reportable` gets its
own, separate members, not shared with other `Reportable` subclasses.
This could be done without metaprogramming, but would require effort
from the user.

    class Reportable
      # class methods here
    end

    class Offense < Reportable
      @rules, @records = [], []
      # rules, etc.
    end

    class Defense < Reportable
      @rules, @records = [], []
      # rules, etc.
    end

If the client forgot to initialize those two members, or got the names
wrong, the class wouldn't work, exceptions would be thrown, [cats and
dogs living together][2]... you get the idea.
You might consider defining those data members in the `Reportable`
class itself, like so:
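
    class Reportable
      @rules, @records = [], []
      # class methods, without inherited
    end

The problem with this is that instance variables set in a class body
belong to that one class object and are not inherited: only
`Reportable` itself would have the arrays, while `@rules` and
`@records` would be nil in every subclass. Switching to class variables
(`@@rules`, `@@records`) would instead make every `Reportable` subclass
share the same rules and records arrays: not the desired outcome
either.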
In the end, the `class_eval` used here, called from `inherited`, is
the right way to do things. It provides a way for the superclass to
inject functionality into the subclass.
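
Here is a stripped-down sketch of the same hook, to show that each subclass really does end up with its own array. `Base` and `record` are hypothetical stand-ins for `Reportable` and its rule handling; `Offense` and `Defense` mirror the client classes mentioned above.

```ruby
class Base   # hypothetical stand-in for Reportable
  class << self
    attr_reader :records

    # Called by Ruby each time Base is subclassed.
    def inherited(klass)
      klass.class_eval { @records = [] }   # runs with self == klass
      super
    end

    def record(item)
      @records << item
    end
  end
end

class Offense < Base; end
class Defense < Base; end

Offense.record('hit')

p Offense.records   # ["hit"]
p Defense.records   # []
p Base.records      # nil
```

`Base` itself is never passed to `inherited`, so its own `@records` stays unset; only the subclasses receive the per-class state.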
Getting back to functionality, `Reportable.match` is straightforward,
but let me highlight one line:

    @records << self.new(rule.result)

If you recall, `result` returns a hash of field names to values. Here
the class passes that hash to its own initializer, yet neither
`Reportable` nor its subclasses define one. This is where `OpenStruct`
comes in.
[OpenStruct][3] "allows you to create data objects and set arbitrary
attributes." `OpenStruct` also provides an initializer that takes the
hash Matthias provides and does what you would expect: each key becomes
an attribute.

    require 'ostruct'
    data = OpenStruct.new( {'amount' => '108', 'name' => 'Tempest Warg'} )
    p data.amount   # -> "108"
    p data.name     # -> "Tempest Warg"

Because `Reportable` derives from `OpenStruct`, all of the client's
classes inherit the same behavior, which fulfills many of the
requirements provided in the class specification.
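
A quick sketch of what that inheritance buys a client class. `Attack` is a hypothetical class name, and it derives from `OpenStruct` directly here so that the example stays self-contained.

```ruby
require 'ostruct'

class Attack < OpenStruct   # hypothetical record class with no methods of its own
end

fields = { 'amount' => '108', 'name' => 'Tempest Warg' }
record = Attack.new(fields)   # the initializer is inherited from OpenStruct

puts record.name      # Tempest Warg
puts record.amount    # 108
p record.weapon       # nil (attributes that were never set return nil)
```

Note that the values remain strings, exactly as captured by the regular expression; any numeric conversion would be up to the client.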
The final class, `Reporter`, is pretty trivial. It reads through a
data source a line at a time, either finding a matching rule (and
creating the appropriate record in the process) or adding the input
line to `@unmatched`, which the client can query later.
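
Matthias' `Reporter` isn't reproduced here, so the following is only a guess at its general shape based on the description above, not his actual code. The constructor arguments, the `read` method, and the `Hits` stand-in class are all assumptions made for illustration.

```ruby
require 'stringio'

# Hypothetical stand-in for a Reportable subclass: anything that responds to
# match(line) and returns a truthy value when the line was recorded.
class Hits
  @records = []

  class << self
    attr_reader :records

    def match(line)
      md = /^You hit (.+?) for (\d+)$/.match(line)
      @records << md.captures if md   # nil when the line does not match
    end
  end
end

# Hypothetical sketch of a Reporter; the interface is assumed.
class Reporter
  attr_reader :unmatched

  def initialize(*matchers)
    @matchers  = matchers
    @unmatched = []
  end

  def read(io)
    io.each_line do |line|
      line = line.chomp
      # any? stops at the first matcher that accepts the line
      @unmatched << line unless @matchers.any? { |m| m.match(line) }
    end
  end
end

reporter = Reporter.new(Hits)
reporter.read(StringIO.new("You hit Rat for 12\nYou yawn\n"))

p Hits.records         # [["Rat", "12"]]
p reporter.unmatched   # ["You yawn"]
```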
Next week we'll take a short break from the Statistician for some
simple stuff. (Part III of Statistician will return in the not-distant
future.)
[1]: http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-talk/57252
[2]: http://www.youtube.com/watch?v=w91-GMc3j7I
[3]: http://www.ruby-doc.org/stdlib/libdoc/ostruct/rdoc/classes/OpenStruct.html
--
Matthew Moss <matthew.moss@gmail.com>