A proposal for Parsing Ruby in Ruby

R3: A Ruby native parsing framework ported from Perl6 Rules.

Introduction:
R3 is a Ruby native implementation of Perl6's Rule engine specification. Perl6 Rules is a good compiler tool set for dynamic languages due to it heavy use of regular expressions and language native constructs such as classes, objects, strings, streams etc. R3 is envisioned as a possible new language feature to Ruby, not merely an add-on library or external compiler tool set such as lex and yaxx, or Antlr.
Power and Rational of Rules:

Perl6 Rules are the basis for many features in Perl6 and can be the basis of corresponding features in other dynamic languages. LISP style macros for dynamic languages can be defined in rules. Dynamically syntax can be implemented using rules. In other words the ability to change the syntax of the language on the fly. Support for first class Domain Specific Languages and corresponding syntax can be implemented using rules. Rules can be used to parse any type of stream: strings, IO streams, network protocol streams.
Proposed work:

    * Implement sub rules and nested match objects to support parsing.
    * Implement sufficient backtracking support to parse Ruby syntax.
    * Implement a Ruby Grammar defined using R3 rules syntax.
    * Implement a Grammar/Rules DSL approach where a Grammar is a class
      and Rules are methods in the class.
    * Implement inline use of rules as an alternative for Ruby Regexs.

Delimitations:

    * The features of Perl6 rules are extensive, R3 will initially only
      support a minimum subset in an attempt to parse Ruby syntax.
    * R3, initially being a port or translation of Perl6 ideas, will
      utilize much of the Perl6 syntax for rules. This is a benefit, as
      work commences, since much of the syntax come from regular
      expressions, which is widely shared across languages.
    * Although not it final potential, R3 will be initially implemented as

Background:
The Perl6 Regex/Rules engine interests me and I think it is a powerful way to build parsers. In order to better understand how it works, I originally tried to help implement rules in Haskell. Recently, I attempted to port PGE, another Perl6 Rules implementation to Ruby and later to Perl5. PGE is written in PIR which is parrot assembly(Perl5 data types, with assembly syntax). Getting from labels and jumps to Perl5 was doable since Perl5 has gotos. Ruby was a little harder. Gotos had to be emulated using blocks and exceptions. Each of theses attempts ended in abandonment, but with each try, I became more familiar with the design and syntax of rules.
My latest attempt(R3) has been to port Pugs::Compiler::Rules to Ruby and I'm finally getting some traction. The control structure of Pugs::Compiler::Rules, continuation passing style(CPS), is easily implemented in Ruby using Procs. As a result, this port has been much more straight forward than previous attempts.

R3 mirrors the architecture of Pugs::Compiler::Rules.

    * Match.rb => Regex/Rules matches returned by rules, which are
      chained together to form rudimentary parse trees.
    * RuntimeRule.rb => Runtime regex primitives needed to implement rules.
    * GrammarRule.rb => Grammar of Rules expressed using primitives from
      RuntimeRule.rb.
    * RuleCompiler.rb => uses the Grammar in GrammarRule.rb to parse
      rules into rule parse trees.
    * RubyEmitter.rb => Takes a rule parse tree from RuleCompiler and
      emits ruby code to parse the grammar specified by rules.

R3 is just now beginning to parse the Perl6 Rule syntax. All of the modules above except the RubyEmitter.rb module have primitive functionality. Work on the RubyEmitter has not yet begun. R3 is is currently hosted in a private SVN repository. See tewk.com/r3.tgz <http://tewk.com/r3.tgz>.

Perl6 Rules Reference Implementations

    * PGE Parrot Grammar Engine (Written in PIR, uses coroutines for
      continued parsing)

    * https://svn.perl.org/parrot/trunk/compilers/pge

# Pugs Rules Engine (Written in Haskell)

    * http://svn.openfoundry.org/pugs/src/Text/Parser

# Pugs::Compiler::Rules (Written in Perl5, generates Perl5 code, uses continuation passing style to backtrack)

    * http://svn.openfoundry.org/pugs/misc/pX/Common/Pugs-Compiler-Rule

# My Previous Attempts

    * http://svn.openfoundry.org/pugs/misc/pX/tewk

See also:

    * http://search.cpan.org/~ltoetsch/parrot-0.4.3/compilers/pge/README.pod
    * http://dev.perl.org/perl6/doc/design/syn/S05.html
    * http://dev.perl.org/perl6/doc/design/apo/A05.html
    * http://dev.perl.org/perl6/doc/design/exe/E05.html

Kevin Tew <tewk@tewk.com> writes:

The control structure of
Pugs::Compiler::Rules, continuation passing style(CPS), is easily
implemented in Ruby using Procs.

Does that scale given Ruby does not perform tail call optimization?

···

--
Christian Neukirchen <chneukirchen@gmail.com> http://chneukirchen.org

Maybe not, but on the other hand, ruby does have real continuations. Maybe he can try doing CPS with Cs instead of Procs.

I personally think this is kind of a silly idea. Perl6 rules are cool, but they are very very Perl. Regular expressions on steroids, not very OO. I think a ruby in ruby project would be much better served by a parser combinator library. The problem is having a performant one in ruby.

···

On Apr 28, 2006, at 8:54 AM, Christian Neukirchen wrote:

Kevin Tew <tewk@tewk.com> writes:

The control structure of
Pugs::Compiler::Rules, continuation passing style(CPS), is easily
implemented in Ruby using Procs.

Does that scale given Ruby does not perform tail call optimization?

I don't believe I've ever heard of anybody using continuations to implement continuations. Instead they use CPS and callcc falls right out for you.

···

On Apr 28, 2006, at 5:54 AM, Christian Neukirchen wrote:

Kevin Tew <tewk@tewk.com> writes:

The control structure of
Pugs::Compiler::Rules, continuation passing style(CPS), is easily
implemented in Ruby using Procs.

Does that scale given Ruby does not perform tail call optimization?

--
Eric Hodel - drbrain@segment7.net - http://blog.segment7.net
This implementation is HODEL-HASH-9600 compliant

http://trackmap.robotcoop.com

Logan Capaldo <logancapaldo@gmail.com> writes:

Kevin Tew <tewk@tewk.com> writes:

The control structure of
Pugs::Compiler::Rules, continuation passing style(CPS), is easily
implemented in Ruby using Procs.

Does that scale given Ruby does not perform tail call optimization?

Maybe not, but on the other hand, ruby does have real continuations.
Maybe he can try doing CPS with Cs instead of Procs.

He certainly can try, but in my experience, they scale even less. :stuck_out_tongue:
However, he probably can reclaim the stack space with exceptions or
catch/throw.

I personally think this is kind of a silly idea. Perl6 rules are
cool, but they are very very Perl. Regular expressions on steroids,
not very OO. I think a ruby in ruby project would be much better
served by a parser combinator library. The problem is having a
performant one in ruby.

I don't think it's a silly idea, and parsing per-se is not a very
object-oriented task. Perl6 rules can be implemented using parser
combinators pretty easily (check the Pugs source for details. :-)), so
if that approach is taken, everyone benefits.

I, for one, would welcome having Perl6 rules or a Rubyish feel-alike
as a Ruby library.

···

On Apr 28, 2006, at 8:54 AM, Christian Neukirchen wrote:

--
Christian Neukirchen <chneukirchen@gmail.com> http://chneukirchen.org

I agree. I looked at perl6 rules before starting to write a parser
library in ruby. You can look for Grammar on rubyforge to see what
I've done. I started working on making a "Grammar" for parsing ruby
but didn't get so far. And I've been away from ruby for several
months now because of work. My main problem was converting the LR
yacc grammar to LL. Now that somebody has done this for ANTLR (also
LL), it shouldn't be too hard.

I solved the performance problems by having the grammar objects
generate there own code (flattening their grammar children also) and
eval it when needed (under the covers - no parser generator step
needed from a usability perspective). I don't think I checked all of
those enhancements into CVS yet :frowning: Last I checked, it was faster and
more memory efficient than racc.

Since I've been busy with work stuff, I wouldn't mind some help with this.

···

On 4/28/06, Logan Capaldo <logancapaldo@gmail.com> wrote:

On Apr 28, 2006, at 8:54 AM, Christian Neukirchen wrote:

> Kevin Tew <tewk@tewk.com> writes:
>
>> The control structure of
>> Pugs::Compiler::Rules, continuation passing style(CPS), is easily
>> implemented in Ruby using Procs.
>
> Does that scale given Ruby does not perform tail call optimization?
>

Maybe not, but on the other hand, ruby does have real continuations.
Maybe he can try doing CPS with Cs instead of Procs.

I personally think this is kind of a silly idea. Perl6 rules are
cool, but they are very very Perl. Regular expressions on steroids,
not very OO. I think a ruby in ruby project would be much better
served by a parser combinator library. The problem is having a
performant one in ruby.

Aha, there's the key point. The OP was proposing making Perl6 style rules part of the language. That's more what I drew exception to, although I did go off on a tangent of sorts, as regards my feelings for the syntax.

···

On Apr 29, 2006, at 9:56 AM, Christian Neukirchen wrote:

I, for one, would welcome having Perl6 rules or a Rubyish feel-alike
as a Ruby library.