R4 - the simplest ruby pre-processor

all this talk about pre-processors and erb got me thinking: what would it
take to code up a ruby pre-processor that could be used for any language? my
first crack is 38 lines of ruby that pre-process any files on the command line
(or stdin) using ruby as the macro language and a convenience function 'macro'
which can be used to define ruby methods which return text that is used as
output. functions declared this way need not do any explicit io. here's an
example:

   harp:~ > cat input.r4
   | # any valid ruby code can go here
   | x = 42
   | # any output produced is used in the generated output
   | p x

   any lines which are not marked with a leading '|' are copied verbatim to the
   output

   | # since the pre-processor language is ruby it can do anything
   | 10.times{|i| p i}

   harp:~ > r4 input.r4

   any lines which are not marked with a leading '|' are copied verbatim to the
   output


and, using r4's macro shortcut to generate c code:

     harp:~ > cat input.c
     | fields = %w( foo bar foobar barfoo )
     | macro('field'){|name| "int #{ name };" }
     | macro('setter') do |name|
     |   <<-c
     |     int set_#{ name }(self, value)
     |       object * self;
     |       int value;
     |     {
     |       return( self->#{ name } = value );
     |     }
     |   c
     | end
     | macro('getter') do |name|
     |   <<-c
     |     int get_#{ name }(self)
     |       object * self;
     |     {
     |       return( self->#{ name } );
     |     }
     |   c
     | end

     struct object {
     | fields.each{|f| field f}
     };

     | fields.each{|f| setter(f); getter(f); }

     harp:~ > r4 input.c

     struct object {
     int foo;
     int bar;
     int foobar;
     int barfoo;
     };

          int set_foo(self, value)
            object * self;
            int value;
          {
            return( self->foo = value );
          }
          int get_foo(self)
            object * self;
          {
            return( self->foo );
          }
          int set_bar(self, value)
            object * self;
            int value;
          {
            return( self->bar = value );
          }
          int get_bar(self)
            object * self;
          {
            return( self->bar );
          }
          int set_foobar(self, value)
            object * self;
            int value;
          {
            return( self->foobar = value );
          }
          int get_foobar(self)
            object * self;
          {
            return( self->foobar );
          }
          int set_barfoo(self, value)
            object * self;
            int value;
          {
            return( self->barfoo = value );
          }
          int get_barfoo(self)
            object * self;
          {
            return( self->barfoo );
          }

and, finally, the source for r4:

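     #! /usr/bin/env ruby
     require 'tempfile'

     # the generated script starts with the 'macro' helper found after __END__
     script = Tempfile::new(File::basename(__FILE__))
     script << DATA.read

     # lines matching this pattern are ruby code, everything else is text
     pat = %r/^\s*\|(.*)$/

     # text is emitted inside a quoted heredoc; hdoc holds the open terminator
     hdoc = []
     start_hdoc = lambda do
       if hdoc.empty?
         script << "puts <<-'" << hdoc.push("___code_#{ rand(2 ** 42) }___").last << "'" << "\n"
       end
     end
     end_hdoc = lambda do
       unless hdoc.empty?
         script << hdoc.pop << "\n"
       end
     end

     ARGF.each do |line|
       m = pat.match line
       if m
         end_hdoc[]
         code = m[1]
         script << code << "\n"
       else
         start_hdoc[]
         script << line
       end
     end
     end_hdoc[]
     script.close

     load script.path

     __END__
     class Object
       def macro(m, &b)
         klass = (Class === self ? self : (class << self; self; end))
         klass.module_eval do
           define_method(m){|*a| puts b.call(*a)}
         end
       end
     end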

obviously the char marking pre-processor lines could be anything - but other
than that there is no special markup.
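
As a rough sketch of the translation step (not the r4 source itself): each marked line is passed through as ruby, and every other line becomes a `puts` of its dumped text. The `translate` name and its `marker` parameter are made up for illustration; the point is only that the marker character is easy to swap.

```ruby
# Hypothetical helper: turn template lines into a ruby script.
# `marker` is the character that flags a line as ruby code.
def translate(lines, marker = '|')
  pat = /^\s*#{Regexp.escape(marker)}(.*)$/
  lines.map do |line|
    if (m = pat.match(line))
      m[1]                         # code line: emit the ruby as-is
    else
      "puts #{line.chomp.dump}"    # text line: emit code that prints it
    end
  end.join("\n")
end

template = ["| x = 6 * 7\n", "x is a number\n", "| puts x\n"]
puts translate(template)           # show the generated script
puts translate(["% y = 1\n", "| is plain text here\n"], '%')
```

Feeding the returned string to `eval` would run the template; printing it instead is a cheap debug mode.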





Alternative implementation ("r5"):

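    def macro(name, &block)
      Kernel.send(:define_method, name) { |*args|
        puts block.call(*args)
      }
    end

    # with $DEBUG set the generated script is printed instead of evaluated
    send(($DEBUG ? "puts" : "eval"), ARGF.readlines.map { |line|
      if line[0] == ?|
        line[1..-1]
      else
        "puts #{line.chomp.dump}\n"
      end
    }.join)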
Happy hacking,


Very cool Ara.

What about making the '|' user-definable, as well as which it isolates:
the code or the text. Use '#' instead and reverse it, you'd have a ruby
program runner that spits out all its comments :-)


Short and sweet. :-)
Reminds me of my BrainF*ck variant implemented in 17 lines of readable


With a tiny change to this you can also use ruby's embedded value substitution (interpolation), like:

     | some_number = 42

     Hello there!

     some number: [[[#{some_number}]]]

to get output like:

     Hello there!

     some number: [[[42]]]

That one change is from:

        "puts #{line.chomp.dump}\n"

to:

        "puts \"#{line.chomp}\"\n"

(just added the quotes).
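
To make the difference concrete, here is a small sketch of the two code-generation styles side by side. The helper names `literal_line` and `interpolated_line` are invented for this example and do not appear in any of the scripts in this thread.

```ruby
# Hypothetical helpers showing the two ways a text line can become code.

# Original style: dump escapes everything, so the text is printed literally.
def literal_line(line)
  "puts #{line.chomp.dump}"
end

# Modified style: plain double quotes, so '#{...}' is interpolated when run.
def interpolated_line(line)
  "puts \"#{line.chomp}\""
end

line = 'some number: [[[#{some_number}]]]' + "\n"
puts literal_line(line)        # generated code with the '#' escaped
puts interpolated_line(line)   # generated code that will interpolate
```

The second form only works as long as the text itself contains nothing that ends the string early.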

This is a very interesting little toy. Maybe not a toy at all.



So, playing around a bit more, I've got something that solves a problem I have right now (with code generation)... using this scheme to add methods to an arbitrary class.

Keeping with the theme, maybe call this r6?

r6.rb -----------------------------
#! /usr/bin/env ruby
def macro(name, &block)
   Kernel.send(:define_method, name) { |*args|
     puts block.call(*args)
   }
end

def build_script(template_file_name, class_name, method_name)

   # Build a method (called 'method_name') in the class 'class_name' that
   # will execute the template (in the file 'template_file_name'). There will
   # be an optional argument that defaults to the empty string (and so this
   # method will, by default, build a new string representing the result). If
   # the argument is supplied, it must respond to the "<<" (append) method.
   #
   # The result variable is available in the template. To write to the result
   # from Ruby code, result << sprintf("hello %s", "world") in the template
   # will get its output where expected.

   # The value of the block (the script text in r) is what gets returned.
   File.open(template_file_name) do | file |
     r = "
class #{class_name}
   def #{method_name}(result=\"\")
     result << \"\"
"
     while line = file.gets
       if line[0] == ?|
         r << " #{line[1..-1]}"
       else
         r << " result << \"#{line.chomp}\\n\"\n"
       end
     end
     r << "
     result
   end
end
"
   end
end

### and this is how it can be used...

# Define a class to hold the template methods. There is an attribute 'message'
# that can be set by the program invoking the template, and referred to by the
# templates. You can put any attributes you want in there.

class R6_Template
   attr_accessor :message
end

# Assume that the command line arguments are all specifying the name of a
# template file. Open each file and pass it to the build script method. When
# the script is built, 'eval' it.

ARGV.each { | script_name |
   method_name = File::basename(script_name, ".*")
   the_script = build_script(script_name, "R6_Template", method_name)
   #puts the_script ## if you want to see what it looks like, uncomment this line
   eval the_script
}

# For illustrative purposes, go over the command line arguments and execute
# the corresponding method defined above.

template = R6_Template.new()
ARGV.each { | script_name |
   method_name = File::basename(script_name, ".*")
   template.message = sprintf("this is script '%s'", method_name)
   puts "#{method_name}*******************"
   what = template.send(method_name);
   puts "{{{#{what}}}}*******************"
}

# This time call the play method directly. There must be a template
# called 'play' (sans extension) for this to work.

template.message = "this is script 'play' -- called explicitly"
puts "!!play!!*******************"
what = template.play()
puts "{{{#{what}}}}*******************"

# Now do the same thing as the illustrative loop above but writing to a file
# with the script name and a ".out" extension.

ARGV.each { | script_name |
   method_name = File::basename(script_name, ".*")
   template.message = sprintf("this is script '%s'", method_name)
   File.open(sprintf("%s.out", method_name), "w") { | file |
     template.send(method_name, file);
   }
}

# Write to a file called "play-x.out", again, there must be a play template
# defined.

template.message = "this is script 'play' -- called explicitly"
File.open("play-x.out", "w") { | file |
   template.play(file)
}

# Build up a single string by applying all the templates
long_string = ""
ARGV.each { | script_name |
   method_name = File::basename(script_name, ".*")
   template.message = sprintf("this is script '%s'", method_name)
   what = template.send(method_name, long_string);
}
puts "!!{{{#{long_string}}}}!!*******************"

play.r6 -----------------------------

| some_number = 42
| def gen_some_number
|   42 + rand
| end

Hello there (play)! #{@message}

     some number: [[[#{some_number}]]]
gen some number: [[[#{gen_some_number}]]]

play_more.r6 -----------------------------

| some_number = 42
| def gen_some_number
|   42 + rand
| end

Hello there(play_more)! #{@message}

| another_number = 99
| result << play()

     some number: [[[#{some_number}]]]
gen some number: [[[#{gen_some_number}]]]
another number: [[[#{another_number}]]]

play again:... #{play}

Note that the play_more template uses the play template (twice)

I've also taken to calling them 'templates'...


a version which optionally interpolates - six lines, ergo r6

     harp:~ > cat in
     | x = 42

     using the -i switch i can interpolate x as #{ x }

     harp:~ > r6 in

     using the -i switch i can interpolate x as #{ x }

     harp:~ > r6 --interpolate in

     using the -i switch i can interpolate x as 42

     harp:~ > cat r6
     #! /usr/bin/env ruby

     def macro(m, &b); Kernel::send('define_method', m){|*a| puts b.call(*a)}; end

     debug = $DEBUG || ENV['DEBUG'] || ARGV.delete('-d') || ARGV.delete('--debug')

     interp = $INTERPOLATE || ENV['INTERPOLATE'] || ARGV.delete('-i') || ARGV.delete('--interpolate')

     meth = debug ? 'puts' : 'eval'

     src = ARGF.readlines.map{|l| l =~ %r/^\s*\|(.*)/ ? $1 : (interp ? "puts \"#{ l.chomp }\"" : "puts #{ l.chomp.dump }")}

     send meth, src.join("\n")




gee, I just read what I wrote and it borders on nonsense (I don't
exactly think in words, so it's not always easy to translate). But to
explain better:

     struct object {
     | fields.each{|f| field f}

change marker, becomes:

     struct object {
     # fields.each{|f| field f}

reverse it, becomes:

     #struct object {
       fields.each{|f| field f}

So, using this setup a normal ruby program would run while printing out
its comments. Not useful, but I just thought it was interesting.
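
A minimal sketch of that reversed mode, assuming '#' as the marker: marked lines are printed and everything else runs as ruby. `comment_echo` is a made-up name, and the sketch only handles full-line comments; a '#' line inside a heredoc or a multi-line expression would break it.

```ruby
# Hypothetical sketch: build a script that runs a ruby program normally
# but also prints each of its full-line comments as it goes.
def comment_echo(source)
  source.each_line.map do |line|
    if line =~ /^\s*#(.*)$/
      "puts #{$1.strip.dump}"   # comment line: print its text
    else
      line.chomp                # anything else: keep as ruby code
    end
  end.join("\n")
end

program = <<~'RUBY'
  # adding things up
  total = [1, 2, 3].inject(0) { |sum, n| sum + n }
  # done
  puts total
RUBY

eval comment_echo(program)      # runs the program, echoing its comments
```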

and lost the dump



Ara.T.Howard wrote:

a version which optionally interpolates - six lines, ergo r6
    src = ARGF.readlines.map{|l| l =~ %r/^\s*\|(.*)/ ? $1 : (interp ? "puts \"#{ l.chomp }\"" : "puts #{ l.chomp.dump }")}

Just a minor simplification: ARGF is kind of an Enumerable, so you can omit the call to readlines.

BTW: nice idea.
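
A sketch of that simplification, assuming only that the input object yields lines when mapped. ARGF does, and so does StringIO, which is used here so the example runs without command-line arguments. `build_source` is an illustrative name, not part of r6.

```ruby
require 'stringio'

# Hypothetical helper: works on ARGF, a File, or a StringIO, since each of
# them is Enumerable over lines; no explicit readlines call is needed.
def build_source(io)
  io.map { |l| l =~ /^\s*\|(.*)/ ? $1 : "puts #{l.chomp.dump}" }.join("\n")
end

input = StringIO.new("| x = 42\nx is set\n")
puts build_source(input)        # prints the generated script
```

In the real script the call would be `build_source(ARGF)`.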


Trans wrote:

So, using this setup a normal ruby program would run while printing out
its comments. Not useful, but I just thought it was interesting.

I think this is a very cool idea for an interesting debug mode.


Bob Hutchison wrote:

That one change is from:

        "puts #{line.chomp.dump}\n"

to:

        "puts \"#{line.chomp}\"\n"
(just added the quotes).

and lost the dump

And broke it.
   | blah = 42
   Hello. I'm an evil template. ", `rm -rf /`, "I advise against running me as root.



I have a code generation problem, and this looks to address my requirements
very well.

careful - if you go the

    "puts \"#{line.chomp}\"\n"

route an errant '#{' or '`' in your input will cause a syntax error. the
original method of

    "puts #{ line.chomp.dump }\n"

will always work though.
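
The failure mode can be checked directly. This is only a sketch under stated assumptions: `runs_cleanly?` is a made-up helper, and the sample line is invented to contain both a double quote and an unterminated '#{'.

```ruby
# Hypothetical check: does a generated line of code parse and evaluate?
def runs_cleanly?(code)
  eval(code.sub(/\Aputs /, ''))   # evaluate just the string literal, print nothing
  true
rescue SyntaxError, StandardError
  false
end

tricky = 'say "hi" and a stray #{' + "\n"

quoted = "puts \"#{tricky.chomp}\""    # plain-quotes style
dumped = "puts #{tricky.chomp.dump}"   # dump style

p runs_cleanly?(quoted)   # the embedded quote ends the string early
p runs_cleanly?(dumped)   # dump escapes both the quotes and the '#{'
```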





Bob Hutchison <hutch@recursive.ca> writes:

That one change is from:

        "puts #{line.chomp.dump}\n"

to:

        "puts \"#{line.chomp}\"\n"
(just added the quotes).

and lost the dump

And broke it.
  | blah = 42
  Hello. I'm an evil template. ", `rm -rf /`, "I advise against running me as root.


Broke it? Nah, it was already 'broken'. And anyway, that's what I
wanted to do, and really would've 'broken' it to achieve that :-)

You broke it because " can't appear as-is in the file anymore.
(Which is likely what I want when I code C or something.)


I have a code generation problem, and this looks to address my requirements
very well.

careful - if you go the

    "puts \"#{line.chomp}\"\n"

route an errant '#{' or '`' in your input will cause a syntax error. the
original method of

    "puts #{ line.chomp.dump }\n"

will always work though.

Except that '#{' or '`' will be escaped and ignored. Still, point taken. Maybe a 'be_careful' option? But as I said in a previous post, a 'be_safe' option isn't really possible. Now, I've not used Ruby in a while, maybe I'm missing something.




     harp:~ > cat in
     | x = 42

     using the -i switch i can interpolate x as #{ x }

     quote " works

     quote ' works

     harp:~ > r7 in

     using the -i switch i can interpolate x as #{ x }

     quote " works

     quote ' works

     harp:~ > r7 --interpolate in

     using the -i switch i can interpolate x as 42

     quote " works

     quote ' works

     harp:~ > cat r7
     #! /usr/bin/env ruby

     def macro(m, &b); Kernel::send('define_method', m){|*a| puts b.call(*a)}; end

     d = $DEBUG || ENV['DEBUG'] || ARGV.delete('-d') || ARGV.delete('--debug')

     i = $INTERPOLATE || ENV['INTERPOLATE'] || ARGV.delete('-i') || ARGV.delete('--interpolate')

     h = '_' * 42

     m = d ? 'puts' : 'eval'

     s = ARGF.readlines.map{|l| l =~ %r/^\s*\|(.*)/ ? $1 : (i ? %Q(puts <<#{ h }\n#{ l.chomp }\n#{ h }) : %Q(puts #{ l.chomp.dump }))}

     send m, s.join("\n")
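
The interpolating branch in the listing above wraps each text line in its own heredoc, so quotes in the text no longer end a string while '#{...}' is still expanded. A small sketch of just that step follows; `heredoc_line` is a hypothetical name, and the delimiter default simply copies the 42 underscores used in r7.

```ruby
# Hypothetical helper mirroring the interpolating branch of r7:
# wrap one text line in a heredoc so quotes survive and '#{...}' is expanded.
def heredoc_line(line, delim = '_' * 42)
  "puts <<#{delim}\n#{line.chomp}\n#{delim}"
end

x = 42
code = heredoc_line(%q(x is #{x}, and "double" or 'single' quotes are fine) + "\n")
eval code   # prints the line with x interpolated and the quotes intact
```

A text line that happens to consist of exactly the delimiter would still end the heredoc early, so this is safer than plain quotes but not bulletproof.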



