Since I was giving a presentation about Grammar at the Lonestar Ruby
Conference (which ended today), I thought it was about time to release
another version of Grammar. I posted it yesterday here:
* second complete rewrite since Grammar 0.5
* BNF-like grammar written directly in Ruby
* LL(1) parser
* separation of Grammar and parser generator ("engine")
* engines could generate a parser in a particular target language or do
something else with the Grammar tree
* current engines: Ruby (generate stand-alone very flat Ruby parser), Ruby0
(directly parse while traversing Grammar), Ruby2CExt (Ruby + convert to C
using ruby2cext)
* lexer and parser grammars are specified in exactly the same way - one
parses characters and the other tokens
* can write a lexer-free parser
* may define tokens however is appropriate
* methods for a lexer (or just token) Grammar to a parser Grammar -
including multi-threading
* arbitrary lookahead can be achieved with the backtrack method
* tail-call optimization of recursion (right recursion)
* directly handles left recursion (historically only doable with (LA)LR
parsers only).
* very very fast - on par with the best hand crafted/tuned recursive descent
LL(1) or Regexp (LL(*)) parsers. 100X faster than many other parser
generators. And that is before Ruby2CExt.
Here is a benchmark for parsing JSON (JSON->ruby). "size" is the size of
the parser specification (stripped out insignificant spacing). "bandwidth"
is the size of the JSON / runtime. hand* and json* are manually created
(well optimized) JSON parsers. The others are from general parser
generators. *ext and *Ext have underlying C code to help (Regexp could be
considered like that also). 3 engines are shown with Grammar 0.8 along with
Grammar 0.5 (5X pure ruby improvement from 0.5->0.8). The benchmark code
along with the JSON parser specs are in the gem (benchmark dir).
Also, for any that looked at the presentation, the demos referred to are in
test/test_demo.rb.
···
On Sat, Sep 6, 2008 at 9:20 PM, Eric Mahurin <eric.mahurin@gmail.com> wrote:
Since I was giving a presentation about Grammar at the Lonestar Ruby
Conference (which ended today), I thought it was about time to release
another version of Grammar. I posted it yesterday here:
* second complete rewrite since Grammar 0.5
* BNF-like grammar written directly in Ruby
* LL(1) parser
* separation of Grammar and parser generator ("engine")
* engines could generate a parser in a particular target language or do
something else with the Grammar tree
* current engines: Ruby (generate stand-alone very flat Ruby parser), Ruby0
(directly parse while traversing Grammar), Ruby2CExt (Ruby + convert to C
using ruby2cext)
* lexer and parser grammars are specified in exactly the same way - one
parses characters and the other tokens
* can write a lexer-free parser
* may define tokens however is appropriate
* methods for a lexer (or just token) Grammar to a parser Grammar -
including multi-threading
* arbitrary lookahead can be achieved with the backtrack method
* tail-call optimization of recursion (right recursion)
* directly handles left recursion (historically only doable with (LA)LR
parsers only).
* very very fast - on par with the best hand crafted/tuned recursive
descent
LL(1) or Regexp (LL(*)) parsers. 100X faster than many other parser
generators. And that is before Ruby2CExt.
Since I was giving a presentation about Grammar at the Lonestar Ruby
Conference (which ended today), I thought it was about time to release
another version of Grammar. I posted it yesterday here:
I believe I finally have infrastructure I'm happy with that is very
extensible. At this point, it would be nice to have help with development.
Here are some of the ideas I'm thinking:
* more engines (the things that traverse Grammar trees)
++ direct Ruby C (target: on par with Florian Frank's Ragel generated JSON C
extension)
++ engines call lambda's for actions instead of inlining (slower, but more
usable/flexible)
++ generation in other languages using native (non-ruby) objects: C++, Java,
python, ...
++ generation of Graphiz dot diagram or other visualization formats
++ LALR or packrat parser generation and/or add some features of those to
current engines (i.e. memoization)
* conversion mechanisms
++ from ANTLR
++ from YACC/RACC/RACC
++ possibly layer these so that these formats could be used directly
* specific language grammars
++ *ML formats - should be simple
++ parse Ruby itself - quite difficult, but should be possible with Grammar
++ other programming languages - ANTLR grammars are probably a good start
* better documentation and examples
++ especially a web page - not my forte
* additional functionality at the Grammar level
++ external object references (not just clones or Marshaled as now)
++ additional common Grammar patterns
* many other things I'm just not seeing
Contact me with any interest.
Eric
···
On Sun, Sep 7, 2008 at 11:22 PM, ara.t.howard <ara.t.howard@gmail.com>wrote:
On Sep 6, 2008, at 8:20 PM, Eric Mahurin wrote:
Since I was giving a presentation about Grammar at the Lonestar Ruby
Conference (which ended today), I thought it was about time to release
another version of Grammar. I posted it yesterday here: