Why are parser tools rarely used in ruby?

Why is it that all the ruby source I find in the Ruby (windows) distribution
has zillions of handwritten parsers: rexml xpath, rdoc, sgml-parser ?

At the same time, no one has answered the recently posted question about
whether to use racc or rbison. I have briefly used racc myself and found
it very easy to use, but I can’t speak for rbison. It seems that these tools
are not in widespread use. Rockit should also be mentioned as a Ruby parser
tool.

One explanation could be that it is comparatively easy to write a parser in
Ruby.
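
To make the "easy to write a parser by hand" point concrete, here is a minimal sketch of a hand-written recursive-descent parser for integer arithmetic. The class name TinyCalc and its small grammar are made up for illustration; they are not taken from any of the libraries named in this thread. It relies only on strscan from the standard library.

```ruby
require 'strscan'

# Hypothetical example: a hand-written recursive-descent calculator.
# Grammar (illustrative only):
#   expr   := term (('+' | '-') term)*
#   term   := factor (('*' | '/') factor)*
#   factor := INTEGER | '(' expr ')'
class TinyCalc
  def initialize(source)
    @scanner = StringScanner.new(source)
  end

  # Parse the whole input and return the computed integer.
  def parse
    value = expr
    skip_space
    raise ArgumentError, "unexpected input at #{@scanner.pos}" unless @scanner.eos?
    value
  end

  private

  def skip_space
    @scanner.skip(/\s+/)
  end

  # Handles the lowest-precedence operators, left to right.
  def expr
    value = term
    loop do
      skip_space
      if @scanner.scan(/\+/)
        value += term
      elsif @scanner.scan(/-/)
        value -= term
      else
        return value
      end
    end
  end

  # Handles '*' and '/' (integer division), left to right.
  def term
    value = factor
    loop do
      skip_space
      if @scanner.scan(/\*/)
        value *= factor
      elsif @scanner.scan(%r{/})
        value /= factor
      else
        return value
      end
    end
  end

  # A factor is an integer literal or a parenthesised expression.
  def factor
    skip_space
    if (digits = @scanner.scan(/\d+/))
      digits.to_i
    elsif @scanner.scan(/\(/)
      value = expr
      skip_space
      @scanner.scan(/\)/) or raise ArgumentError, "expected ')' at #{@scanner.pos}"
      value
    else
      raise ArgumentError, "expected a number or '(' at #{@scanner.pos}"
    end
  end
end

puts TinyCalc.new("2 + 3 * (4 - 1)").parse
```

Each grammar rule becomes one method, so no generator or extra build step is involved. The trade-off is that precedence and error handling are written out by hand.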

Another reason could be that using a compiler compiler requires an
additional build step, and since Ruby doesn’t generally use a build procedure,
this is a bit annoying.

Other reasons could be about error reporting, lack of knowledge about the
available tools, or lack of tools in typical distributions.

Actually Racc does not seem to be in the 1.6.7 Windows distribution - but I
recall it being in an earlier version - a cygwin issue?

So I just wonder - why are these tools not in more widespread use?

Mikkel

“MikkelFJ” mikkelfj-anti-spam@bigfoot.com writes:

Why is it that all the ruby source I find in the Ruby (windows)
distribution has zillions of handwritten parsers: rexml xpath, rdoc,
sgml-parser ?

The RDoc history went something like this:

I wanted to do RDoc for a long time, and was waiting for a parser in
Ruby that could parse Ruby fully. There were (if I remember correctly)
three options, but none could do the full parse (and do things like
keep comments associated with the correct nodes in the parse
tree). That situation may have changed now.
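
As a rough illustration of what "keeping comments associated with the correct nodes" means, here is a deliberately naive, line-based sketch. The method name documented_defs is hypothetical, and this is not how RDoc or the irb parser works; it only shows the kind of bookkeeping a documentation tool needs and a general-purpose parse tree usually throws away.

```ruby
# Hypothetical helper: map each method name to the comment block that
# immediately precedes its `def` line. Line-based and regex-driven, so it
# only understands very simple source layouts.
def documented_defs(source)
  pending = []   # comment lines seen since the last non-comment line
  docs = {}
  source.each_line do |line|
    case line
    when /\A\s*#\s?(.*)$/
      pending << Regexp.last_match(1)
    when /\A\s*def\s+([\w.]+[?!=]?)/
      docs[Regexp.last_match(1)] = pending.join("\n")
      pending = []
    else
      # Any other line (including a blank one) detaches the comment.
      pending = []
    end
  end
  docs
end

sample = <<~RUBY
  # Adds two numbers.
  # Returns their sum.
  def add(a, b)
    a + b
  end

  def undocumented
  end
RUBY

documented_defs(sample).each do |name, doc|
  puts "#{name}: #{doc.empty? ? '(no comment)' : doc.inspect}"
end
```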

At the same time, I was driven to keep RDoc self contained. I dislike
downloading a simple package only to find I need to download the
latest option parsing library (because the author preferred it over the
installed one, and even though the author used no features that
weren’t in the standard one), strscan (because those extra 10 ms are
important), an XML library (just because…) and so on.

So, I had a look at the irb Ruby parser, and it turned out to be
fairly simple to adapt. I’ve been slowly hacking it down and removing
extra features over time, but even its first incarnation worked well,
and worked in a couple of days.

So, speaking for me, I found no compelling reason to use a parser
generator for RDoc. If I were using Ruby to parse a more regular
language, I’d probably use one.

Cheers

Dave

“MikkelFJ” mikkelfj-anti-spam@bigfoot.com wrote in message

are not in widespread use. Rockit should also be mentioned as Ruby parser
tool.

I tried using Rockit but could not make it work. Posting for help on this ML
did not produce any response.;-(

BTW, I would like to know what you find out and which path you will take …
Thanks,
–shanko

I tried doing some parsing in Ruby months ago. No prior experience in
other languages.

Racc docs assumed experience with C parsers, but grokking C docs and
translating to Ruby and racc was over my head.

I got Rockit to work, understood the examples, and wrote some very
simple stuff. It felt more Ruby-wayish than what I saw in racc docs.
Unfortunately, it seems like it’s no longer developed.

I did not try rbison.

I guess that some introductory docs aimed specifically at Ruby
programmers could go a long way toward spreading their use (it sure would
convince me).

Massimiliano

···

On Tue, Sep 17, 2002 at 12:31:15AM +0900, MikkelFJ wrote:

So I just wonder - why are these tools not in more widespread use?

MikkelFJ wrote:

Why is it that all the ruby source I find in the Ruby (windows) distribution
has zillions of handwritten parsers: rexml xpath, rdoc, sgml-parser ?

rdoc uses rexml.

At the same time, no one has answered the recently posted question about
whether to use racc or rbison. I have briefly used racc myself and found
it very easy to use, but I can’t speak for rbison. It seems that these tools
are not in widespread use. Rockit should also be mentioned as a Ruby parser
tool.

I also used racc. I cannot compare it to rbison, as I have not used that
one, but racc and bison are used in almost the same way, since both
accept LALR grammars. So, at the conceptual level, there are nearly
no differences; choosing racc or rbison is up to you. The harder step
is writing the grammar.

One explanation could be that it is comparatively easy to write a parser in
Ruby.

Another reason could be that using a compiler compiler requires an
additional build step, and since Ruby doesn’t generally use a build procedure,
this is a bit annoying.

But you can generate a complete parser with racc and then distribute
it as a plain Ruby script file, so I do not think this argument is
relevant.

Other reasons could be about error reporting, lack of knowledge about the
available tools, or lack of tools in typical distributions.

Actually Racc does not seem to be in the 1.6.7 Windows distribution - but I
recall it being in an earlier version - a cygwin issue?

So I just wonder - why are these tools not in more widespread use?

I think these tools require some theoretical study before you can use them
effectively.

Pierre Brengard

···

“MikkelFJ” mikkelfj-anti-spam@bigfoot.com wrote in message news:am4qlr$pg6$1@netsrv2.spss.com

So I just wonder - why are these tools not in more widespread use?

It could be that most pseudocode is valid Ruby. And the language-like
things you can add to Ruby are amazing – [ruby-talk:50138] and
[ruby-talk:26792] really impressed me.
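
A small sketch of the kind of "language-like thing" meant here: a toy internal DSL built with instance_eval, so that no separate parser is needed at all. RecipeBuilder, recipe, add and stir are hypothetical names invented for this example; they do not come from the posts cited above.

```ruby
# Hypothetical example of an internal DSL. The block passed to `recipe`
# is evaluated with the builder as `self`, so bare method calls inside it
# read like keywords of a tiny language.
class RecipeBuilder
  attr_reader :steps

  def initialize
    @steps = []
  end

  def add(amount, ingredient)
    @steps << "add #{amount} of #{ingredient}"
  end

  def stir(minutes:)
    @steps << "stir for #{minutes} min"
  end
end

def recipe(&block)
  builder = RecipeBuilder.new
  builder.instance_eval(&block)
  builder.steps
end

steps = recipe do
  add "200 g", "flour"
  add "2", "eggs"
  stir minutes: 3
end

puts steps
```

Ruby's own parser does all the syntactic work here, which may be one reason a generator feels unnecessary for small configuration-style languages.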

~ Patrick

MFJ:

I don’t think Racc can compile from MS Visual Studio. I imagine this is a
barrier to including it in the Win distribution, since it is compiled from
MSVC. I may be wrong.

I’ve been actively coding a YAML parser for Ruby [http://yaml4r.sf.net/]
which uses Racc for its parser. Having always adored Yacc, I found Racc a natural
counterpart. I find it to execute speedily and development itself has been swift.

I imagine the lack of Racc usage is primarily due to lack of documentation.
(Well, and the fact that it doesn’t compile on the Playskool desktop platforms.)
Sure, if you know how to use Yacc, the current docs make good sense. I think all
it would take to make Racc more successful is a couple of friendly HOWTOs. I’ve
been thinking of writing one up on the RubyGarden Wiki. I’ve been searching for
another text to assist in describing compiler compiler concepts to newbies. I
think the Happy User Guide [http://yeats.ucc.ie/~abf/happy/happy.html] could be
used as a guideline. So could the Bison info page.

So, I see through your goggles, too. You’d think that Racc (or a close relative)
would be included in the Ruby distributions. It could be leveraged by so many
other libraries.

_why

···

MikkelFJ (mikkelfj-anti-spam@bigfoot.com) wrote:

Actually Racc does not seem to be in the 1.6.7 Windows distribution - but I
recall it being in an earlier version - a cygwin issue?

“MikkelFJ” mikkelfj-anti-spam@bigfoot.com wrote in message

are not in widespread use. Rockit should also be mentioned as Ruby parser
tool.

I tried using Rockit but could not make it work. Posting for help on this ML
did not produce any response.;-(

I have used Rockit and it worked fine, after I installed Memoize and
I don’t remember what else.

BTW, I would like to know what you find out and which path you will take …
Thanks,
–shanko

I will not be using Rockit because it is too slow and not yet proven,
although it is a very powerful parser.

···

On Tue, Sep 17, 2002 at 02:51:45AM +0900, Shashank Date wrote:


Jim Freeze

Programming Ruby
def initialize; fun; end
A language with class

Just to put in a little plug for a friend…Bob Calco is actively
working on a C parser for Ruby code, based on Matz’s parse.y file, that
detaches Ruby source parsing from the runtime. It’s going to build
C structures for the parsed AST and will be usable by Ruby programs that
need to parse Ruby, one example being the FreeRIDE IDE, for fast source
parsing. It will also be useful for other languages that want to parse
Ruby (through C).

-Rich

···

-----Original Message-----
From: Massimiliano Mirra [mailto:list@NOSPAMchromatic-harp.com]
Sent: Monday, September 16, 2002 9:31 PM
To: ruby-talk ML
Subject: Re: Why are parser tools rarely used in ruby?

On Tue, Sep 17, 2002 at 12:31:15AM +0900, MikkelFJ wrote:

So I just wonder - why are these tools not in more widespread use?

I tried doing some parsing in Ruby months ago. No prior
experience in other languages.

Racc docs assumed experience with C parsers, but grokking C
docs and translating to Ruby and racc was over my head.

I got Rockit to work, understood the examples, and wrote some
very simple stuff. It felt more Ruby-wayish than what I saw
in racc docs. Unfortunately, it seems like it’s no longer developed.

I did not try rbison.

I guess that some introductory docs specifically aimed to
Ruby programmers could go a long way in spreading their use
(it sure would convince me).

Massimiliano

Pierre Brengard pbrengard@bct-technology.com writes:

MikkelFJ wrote:

Why is it that all the ruby source I find in the Ruby (windows) distribution
has zillions of handwritten parsers: rexml xpath, rdoc, sgml-parser ?

rdoc uses rexml.

Does it?

“why the lucky stiff” ruby-talk@whytheluckystiff.net wrote in message
news:20020917193309.GA29004@rysa.inetz.com

Actually Racc does not seem to be in the 1.6.7 Windows distribution - but I
recall it being in an earlier version - a cygwin issue?

I’ve been actively coding a YAML parser for Ruby [http://yaml4r.sf.net/]
which uses Racc for its parser. Having always adored Yacc, I found Racc a natural
counterpart. I find it to execute speedily and development itself has been swift.

I wouldn’t say I adore yacc. I find the $1 etc. syntax cumbersome while you
develop the grammar, and you write many rules to do simple things - in that
respect Rockit rocks. But yacc is efficient and to the point. It has been
ported to many, many languages. So if you know yacc, racc is the natural
choice for Ruby.

I imagine the lack of Racc usage is primarily due to lack of documentation.

I don’t see the lack of documentation as a big problem, given that I found
racc in the Ruby lib directory knowing nothing about it, looked at a readme
or an example, and was able to write a small parser with no difficulty (granted,
I’ve written a few parsers before). My point is that yacc is a lingua franca
for parser tools. So perhaps you don’t learn yacc syntax from reading the racc
documentation, but yacc is part of the toolbox any serious language provides.

I have also worked with ocamlyacc over the past few months, and it is
extremely efficient because it is so easy to build data structures and
because the parser is type-safe (removing many tricky parser bugs early).
Racc isn’t type-safe, but it does let you build data structures easily,
thanks to Ruby. This makes it much easier than writing an equivalent
parser in C.

(Well, and the fact that it doesn’t compile on the Playskool desktop platforms.)
Sure, if you know how to use Yacc, the current docs make good sense.

Exactly.

I think all it would take to make Racc more successful is a couple of friendly
HOWTOs. I’ve been thinking of writing one up on the RubyGarden Wiki.

If you don’t know yacc, the bison manual is quite good - so perhaps the
HOWTO should point that out.

Among parser tools it is also worth mentioning ANTLR, although it’s only C++
and Java.

Mikkel

···

MikkelFJ (mikkelfj-anti-spam@bigfoot.com) wrote:

why the lucky stiff wrote:

Actually Racc does not seem to be in the 1.6.7 Windows distribution - but I
recall it being in an earlier version - a cygwin issue?

MFJ:

I don’t think Racc can compile from MS Visual Studio. I imagine this is a
barrier to including it in the Win distribution, since it is compiled from
MSVC. I may be wrong.

yes, it can compile. You just have to fix the setup.rb script and
replace ‘make’ by ‘nmake’ as the make tool. It should compile.

Pierre Brengard

···

MikkelFJ (mikkelfj-anti-spam@bigfoot.com) wrote:

“Jim Freeze” jim@freeze.org wrote in message
news:20020916145431.A14687@freeze.org

I have used Rockit and it worked fine, after I installed Memoize and
I don’t remember what else.

I tried Rockit after installing Memoize and nothing else. And I got this :

      C:/ruby/lib/ruby/site_ruby/resultcache.rb:5: warning: already initialized constant BoundedLruCache

due to two conflicting class definitions:

      class BoundedLruCache          # in rockit/bounded_lru_cache.rb, required by rockit
      class BoundedLruCache < Hash   # in site_ruby/resultcache.rb, required by Memoize

I do not know how to get around this problem.
Any help will be highly appreciated.
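
The clash described above can be reproduced in miniature. The sketch below uses stand-in names (DemoLruCache, RockitLike, MemoizeLike) rather than the real library files, and it is only an assumption that namespacing would be a practical fix for Rockit and Memoize themselves, since it means editing one of the libraries. Note that the Ruby 1.6 interpreter in this report printed a warning, whereas current Ruby versions raise a TypeError for the same situation.

```ruby
# Stand-in for the first library: a plain top-level class.
class DemoLruCache
end

# Stand-in for the second library: the same top-level name, but with a
# different superclass. Current Ruby refuses to reopen the class this way.
collision = nil
begin
  class DemoLruCache < Hash
  end
rescue TypeError => e
  collision = e.message
end
puts "collision: #{collision}"

# One possible way out (an assumption, not something confirmed in this
# thread): give each definition its own namespace so the names no longer
# compete for the same top-level constant.
module RockitLike
  class DemoLruCache
  end
end

module MemoizeLike
  class DemoLruCache < Hash
  end
end

puts RockitLike::DemoLruCache.superclass
puts MemoizeLike::DemoLruCache.superclass
```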

TIA,

– shanko

In article 001301c25dea$5dfbaae0$d501a8c0@TECHNO,

···

Rich Kilmer rich@infoether.com wrote:

Just to put in a little plug for a friend…Bob Calco is actively
working on a C parser for Ruby code, based on Matz’s parse.y file, that
detaches Ruby source parsing from the runtime. It’s going to build
C structures for the parsed AST and will be usable by Ruby programs that
need to parse Ruby, one example being the FreeRIDE IDE, for fast source
parsing. It will also be useful for other languages that want to parse
Ruby (through C).

So we’ll be able to get the AST for the currently running program? Any
schedule for the initial release? Does Bob need any help? I’m thinking
this might be useful for Cardinal.

BTW: Is this similar to Ruth, which I believe is also based on Matz’
parse.y file?

Phil

In article 3d87a236$0$64151$edfadb0f@dspool01.news.tele.dk,

···

MikkelFJ mikkelfj-anti-spam@bigfoot.com wrote:

“why the lucky stiff” ruby-talk@whytheluckystiff.net wrote in message
news:20020917193309.GA29004@rysa.inetz.com

MikkelFJ (mikkelfj-anti-spam@bigfoot.com) wrote:

Actually Racc does not seem to be in the 1.6.7 Windows distribution - but I
recall it being in an earlier version - a cygwin issue?

I’ve been actively coding a YAML parser for Ruby [http://yaml4r.sf.net/]
which uses Racc for its parser. Having always adored Yacc, I found Racc a natural
counterpart. I find it to execute speedily and development itself has been swift.

I wouldn’t say I adore yacc. I find the $1 etc. syntax cumbersome while you
develop the grammar, and you write many rules to do simple things - in that
respect Rockit rocks. But yacc is efficient and to the point. It has been
ported to many, many languages. So if you know yacc, racc is the natural
choice for Ruby.

There have been a couple of positive mentions of Rockit in this thread.
I’ve not used it, but I read the section in the Ruby Developer’s Guide on
Rockit. It seems that the consensus is that it ‘rocks’ but is perhaps a
bit slow. Is anyone still working on improving Rockit’s performance?
(Robert?)

Phil

Dave Thomas wrote:

Pierre Brengard pbrengard@bct-technology.com writes:

MikkelFJ wrote:

Why is it that all the ruby source I find in the Ruby (windows) distribution
has zillions of handwritten parsers: rexml xpath, rdoc, sgml-parser ?

rdoc uses rexml.

Does it?

errh…

in fact, no

I am really sorry, I was confused by some other stuff and by the
fact that I first used rdoc on REXML (I discovered the functionality
of rdoc by generating the REXML documentation)

yes, I am ashamed

Pierre Brengard

“Pierre Brengard” pbrengard@bct-technology.com wrote in message
news:3D8825AE.7030603@bct-technology.com

I don’t think Racc can compile from MS Visual Studio. I imagine this is a
barrier to including it in the Win distribution, since it is compiled from
MSVC. I may be wrong.

yes, it can compile. You just have to fix the setup.rb script and
replace ‘make’ by ‘nmake’ as the make tool. It should compile.

I tested it. No problem at all - as mentioned, just replace make with nmake
and follow the instructions, calling setup.rb three times with config, setup
and install as arguments.
Also rename racc to racc.rb in the bin directory so you can execute racc
directly from the command line on Windows XP.
Then run “racc …/sample/calc.y” followed by “calc.tab.rb” and you have a
running calculator.

racc has an -E switch to create self contained parsers - the calc.y sample
grows from 4 to 17K of ruby code using the -E switch. That’s a pretty
tolerable overhead.

I still haven’t tested rbison, but it appears to require bison to do its
job, which makes racc more attractive - certainly for Windows users.

Mikkel

Phil et al.:

I think I can have an operational lexer/parser in about two weeks or so. It
will be a DLL you can dynamically link to and/or a LIB file that you can
statically link to in C or C++. The goal is to provide a 100% 1.7
compatible, general-purpose AST and an API for traversing it, whatever your
purpose is - interpreting, compiling to C code, compiling to byte code, etc.
It will be C compatible so you could theoretically call it from Ruby/DL for
basic parsing, obviating the need to code in C/C++ for those C-sick types
who get queasy at the mere sight of declared variables, asterisks,
ampersands and squiggly braces.

Not sure about how it will stack up to Ruth; I haven’t seen that. Got a
link? I couldn’t find it in RAA doing a brain-dead “CTRL-F” search for
"Ruth".

– Bob

%% -----Original Message-----
%% From: Phil Tomson [mailto:ptkwt@shell1.aracnet.com]
%% Sent: Tuesday, September 17, 2002 2:33 AM
%% To: ruby-talk ML
%% Subject: Re: Why are parser tools rarely used in ruby?
%%
%%
%% In article 001301c25dea$5dfbaae0$d501a8c0@TECHNO,

···

%% Rich Kilmer rich@infoether.com wrote:
%% >Just to put in a little plug for a friend…Bob Calco is actively
%% >working on a c-parser for Ruby code based on Matz’s parse.y file that
%% >detaches the Ruby source parsing from the Runtime. Its going to build
%% >c-structures of the parsed AST and will be usable by Ruby programs (to
%% >parse Ruby) examples of which are the FreeRIDE IDE for fast source
%% >parsing. It will also be useful for other languages that want to parse
%% >Ruby (through C).
%% >
%%
%% So we’ll be able to get the AST for the currently running program? Any
%% schedule for the initial release? Does Bob need any help? I’m thinking
%% this might be useful for Cardinal.
%%
%% BTW: Is this similar to Ruth which I believe is also based on Matz’
%% parse.y file?
%%
%% Phil
%%

I’m a bit ashamed to say that I’ve had almost no time for Ruby or Rockit
this year. My hope is that things will start looking better in
January '03. Until then, I’m sad to say, Rockit will probably not be
the best parser generator choice for Ruby.

The Rockit version I’ve been working on (0.4) uses a new model with
better performance characteristics. It is designed to have both a Ruby and
C backend if you really need the speed.

Hope to be back in the community next year!

Regards,

Robert Feldt

···

On Wed, 18 Sep 2002, Phil Tomson wrote:

There have been a couple of positive mentions of Rockit in this thread.
I’ve not used it, but I read the section in the Ruby Developer’s Guide on
Rockit. It seems that the consensus is that it ‘rocks’ but is perhaps a
bit slow. Is anyone still working on improving Rockit’s performance?
(Robert?)

Hi,

In mail “Re: Why are parser tools rarely used in ruby?”

yes, it can compile. You just have to fix the setup.rb script and
replace ‘make’ by ‘nmake’ as the make tool. It should compile.

I tested it. No problem at all - as mentioned - just replace make with nmake
and follow the instructions calling setup.rb three times using config, setup
and install as arguments.

Try this:

% ruby setup.rb config --make-prog=nmake

In addition, you can install racc without C compiler:

% ruby setup.rb config --without-ext

racc has an -E switch to create self contained parsers - the calc.y sample
grows from 4 to 17K of ruby code using the -E switch. That’s a pretty
tolerable overhead.

“racc -E” makes the generated parser include racc/parser.rb, which
is about 12 Kbytes in size, so calc.rb grows a lot. But as a parser
gets bigger, the overhead becomes relatively small.

–Minero Aoki

···

“MikkelFJ” mikkelfj-anti-spam@bigfoot.com wrote: