Rlex and ryacc

Hi all,

I’m just cutting my teeth on ruby, after a long time with perl as a
sysadmin. I’m currently trying to write a language compiler. Because of
my experience with perl, I am currently planning on using some kind of
lex/yacc combo, mainly because it should be pretty fast and it should be
relatively easy.

However, rlex and ryacc don’t seem to be quite functional. rlex didn’t
work with debian, although it worked apparently fine on my Solaris x86
box. ryacc, though, is using /**/ style comments and seems to have left
out the middle couple hundred lines of code, which causes a few problems.

rockit hasn’t been updated in 2 years, so I’m guessing it’s not compatible
with the latest stuff. racc seems to exist and be current, but it
looks like a recursive descent parser (I’m not really even sure how to
tell, it just seems to have a similar grammar), and the last one I used
(Parse::RecDescent in perl) was about 10 times slower than a roughly
equivalent Parse::Lex/Parse::Yapp combo.

So, my question is: Is there a good parsing solution in ruby right now?
I really don’t want to write my own, as I don’t think I’m up to it (I’m
barely up to writing the parser with a parser compiler), but I really
would like to use ruby, as all of my prototype code is in ruby and it has
worked smashingly.

I’ve read through the ruby-talk archives, and what consensus I could find
there seemed to point to racc, so maybe I just need someone to correct my
ideas about how racc works and whether I should use it. I’m not
particularly attached to yacc-like functionality, as I’ve really only used
it once, but I am definitely concerned about speed.

Any help would be greatly appreciated.

Thanks,
Luke Kanies

···


Computers are not intelligent. They only think they are.

I have used racc successfully on a project and did not have a problem
with speed. However, there are ways to speed up racc when needed.
Also, you could write your parser in C (IIRC, YAML started out in racc and
went to C. Ask Why.) I am no expert, but I can probably get you going
if you choose racc.

There is also rbison, but I am not familiar with the differences between
it and racc, although I think it offers roughly the same features as
racc.

And, rockit is being re-written in C and should be released now, but I
haven’t followed its progress.

···

On Friday, 7 November 2003 at 15:04:33 +0900, Luke A. Kanies wrote:

Hi all,

So, my question is: Is there a good parsing solution in ruby right now?
I really don’t want to write my own, as I don’t think I’m up to it (I’m
barely up to writing the parser with a parser compiler), but I really
would like to use ruby, as all of my prototype code is in ruby and it has
worked smashingly.

I’ve read through the ruby-talk archives, and what consensus I could find
there seemed to point to racc, so maybe I just need someone to correct my
ideas about how racc works and whether I should use it. I’m not
particularly attached to yacc-like functionality, as I’ve really only used
it once, but I am definitely concerned about speed.


Jim Freeze

Oops, missed that question. Racc is LALR(1).

···

On Friday, 7 November 2003 at 15:04:33 +0900, Luke A. Kanies wrote:

Hi all,

with the latest stuff. racc seems to exist and be current, but it
looks like a recursive descent parser (I’m not really even sure how to
tell, it just seems to have a similar grammar), and the last one I used


Jim Freeze

Good day to let down old friends who need help.

I have used racc successfully on a project and did not have a problem
with speed. However, there are ways to speed up racc when needed.
Also, you could write your parser in C (IIRC, YAML started out in racc and
went to C. Ask Why.) I am no expert, but I can probably get you going
if you choose racc.

Okay, I’m trying to use that, and I can’t even get that far. What did you
use for a lexer? rlex seems to be giving me no end of problems. There’s
the inconsistency between linux and solaris, and now I’m finding that
although I get a valid Lexer.rb on solaris, it does not define the
necessary constants, so it obviously doesn’t work.

And when I rewrite my token definitions just to print and not return
anything, I get an infinite loop, because apparently the lexer isn’t
actually consuming the text or something.
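
A loop like this usually means some rule matched without moving the read position forward. The sketch below is not rlex's generated code; it uses Ruby's bundled StringScanner, and the rule table and the `scan_all` name are made up for illustration. It shows one way to fail loudly instead of spinning when nothing consumes input:

```ruby
require 'strscan'

# Hypothetical helper: tokenize src with a small rule table and
# refuse to loop if no rule advances the scanner.
def scan_all(src)
  # Each pattern maps to a token type; nil means "consume, emit nothing".
  rules = [
    [/\s+/, nil],
    [/\d+/, :NUMBER],
    [/\w+/, :WORD],
  ]

  ss = StringScanner.new(src)
  out = []
  until ss.eos?
    before = ss.pos
    rules.each do |pattern, type|
      text = ss.scan(pattern)   # nil if the pattern does not match here
      next unless text
      out << [type, text] if type
      break
    end
    # If the position did not move, no rule consumed anything.
    if ss.pos == before
      raise "no rule matches at offset #{ss.pos}: #{ss.rest[0, 10].inspect}"
    end
  end
  out
end

p scan_all("abc 12")
```

With this guard, input such as "1 + 2" raises an error at the "+" rather than hanging.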

There is also rbison, but I am not familiar with the differences between
it and racc, although I think it offers roughly the same features as
racc.

I never even found this. If racc doesn’t work out, I’ll look at it.

And, rockit is being re-written in C and should be released now, but I
haven’t followed its progress.

The web site doesn’t have releases more recent than 2001, so I don’t think
anything’s really available.

Racc seems fine, now if I could just find a lexer.

Thanks,
Luke

···

On Fri, 7 Nov 2003, Jim Freeze wrote:


Due to circumstances beyond your control, you are master of your fate
and captain of your soul.

Okay, just to summarize…

I’m still having problems, but I have at least made progress.

Here’s what I’ve found so far:

rlex:
Does not generate code that’s valid for 1.8, but the code does run.
Does not seem to like parsing strings, only files (I get an infinite
loop)

ryacc:
Does not generate a valid Parse.rb on any platform I’ve tested it
on, and is therefore unusable at this point.

racc:
Does not give much info about what to do for a lexer. I’ve tried
rlex, but they have incompatible means of specifying tokens,
apparently, so I don’t know how to use them together. Without
a lexer, I can’t really tell if this will work for me.

slex:
I could definitely be wrong here, but this seems to only function on
one character at a time, which is great if you need lots of control
and have a complicated syntax, but neither case applies to me (yet).
I have no idea how to make this work for me.

So, I may have a valid parser, but I can’t seem to find a good lexer.
Am I down to writing my own (which I’d prefer not to do), or is there some
way to integrate rlex and racc? How are other people solving this
problem? I would especially like to see other people’s real-life examples
of their next_token/yyparse routines.

If I end up using rlex, I’m fully willing to modify it to generate valid,
1.8 code. I’ve already emailed the author, but have gotten no response
yet (it’s only been a couple of days).

Once again, any ideas would be greatly appreciated. And considering how
often this seems to come up on this list (from looking through the
archives) it might be a good idea to get this solved once and for all. :-)

Luke

···


“Most people are born and years later die without really having lived
at all. They play it safe and tiptoe through life with no aspiration
other than to arrive at death safely.” – Tony Campolo, “Carpe Diem”

rlex:
Does not generate code that’s valid for 1.8, but the code does run.
Does not seem to like parsing strings, only files (I get an infinite
loop)
Had the same problems, but made a patch for rlex. It works for me,
but isn’t tested too well. And don’t forget to redefine wrap to control
wrapping behavior.

Patch for rlex attached.

HTH Tim

rlex.patch (698 Bytes)

···


“Lately, the only thing keeping me from becoming a serial killer is my distaste
for manual labor.”
– Dilbert
NP: Saints of Eden - Slow Stay (Crushed Remix)

Hi,

In mail “Re: rlex and ryacc”

I’m still having problems, but I have at least made progress.

racc:
Does not give much info about what to do for a lexer. I’ve tried
rlex, but they have incompatible means of specifying tokens,
apparently, so I don’t know how to use them together. Without
a lexer, I can’t really tell if this will work for me.

I’m using StringScanner (strscan). It is NOT a lexer generator,
but it is sufficient for me (including speed).
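
For readers who have not used strscan, here is a minimal sketch of a hand-written lexer built on it. The token names (:NUMBER, :IDENT) and the `tokenize` helper are invented for this example and are not taken from TMail or RDtool:

```ruby
require 'strscan'

# Hypothetical helper: split a string into [type, value] pairs.
def tokenize(src)
  ss = StringScanner.new(src)
  tokens = []
  until ss.eos?
    if ss.skip(/\s+/)
      # whitespace is consumed and dropped
    elsif (text = ss.scan(/\d+/))
      tokens << [:NUMBER, text.to_i]
    elsif (text = ss.scan(/[A-Za-z_]\w*/))
      tokens << [:IDENT, text]
    else
      # Any other single character stands for itself. getch always
      # consumes one character, so the loop cannot stall.
      ch = ss.getch
      tokens << [ch, ch]
    end
  end
  tokens
end

p tokenize("x = 42 + y")
```

Each branch either consumes input or falls through to `getch`, so the scanner position always advances.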

For real examples of Racc in use, refer to TMail or RDtool.

TMail (uses #yylex. The lexer is written in Ruby and C)
http://raa.ruby-lang.org/list.rhtml?name=tmail

RDtool (uses #do_parse and #next_token.
The lexer is written in Ruby using strscan)
http://raa.ruby-lang.org/list.rhtml?name=rdtool
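
As a rough sketch of the #next_token side of that arrangement: Racc's do_parse calls next_token repeatedly and expects a [token_symbol, value] pair each time, with a false token symbol marking end of input. The class below is a stand-alone illustration with invented names (CalcLexer, :NUMBER); it is not RDtool's code, and in a real grammar file the method would normally sit in the parser class's "---- inner" section:

```ruby
require 'strscan'

# Hypothetical token source following Racc's next_token convention.
class CalcLexer
  def initialize(src)
    @ss = StringScanner.new(src)
  end

  # Returns [token_symbol, value]; [false, false] signals end of input.
  def next_token
    @ss.skip(/\s+/)
    return [false, false] if @ss.eos?

    if (text = @ss.scan(/\d+/))
      [:NUMBER, text.to_i]
    else
      # Single characters such as "+" are returned as string tokens.
      ch = @ss.getch
      [ch, ch]
    end
  end
end

lexer = CalcLexer.new("1 + 2")
while (tok = lexer.next_token) != [false, false]
  p tok
end
```

Keeping the scanner in an instance variable lets the parser pull one token at a time instead of tokenizing the whole input up front.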

Regards,
Minero Aoki

···

“Luke A. Kanies” luke@madstop.com wrote:

In mail “Re: rlex and ryacc”
I wrote:

I’m using StringScanner (strscan). It is NOT a lexer generator,
but it is sufficient for me (including speed).

StringScanner is bundled with ruby 1.8.
For ruby 1.6, check http://raa.ruby-lang.org/list.rhtml?name=strscan

– Minero Aoki

In article 20031109180247X.aamine@loveruby.net,

···

Minero Aoki aamine@loveruby.net wrote:

Hi,

In mail “Re: rlex and ryacc”
“Luke A. Kanies” luke@madstop.com wrote:

I’m still having problems, but I have at least made progress.

racc:
Does not give much info about what to do for a lexer. I’ve tried
rlex, but they have incompatible means of specifying tokens,
apparently, so I don’t know how to use them together. Without
a lexer, I can’t really tell if this will work for me.

I’m using StringScanner (strscan). It is NOT a lexer generator,
but it is sufficient for me (including speed).

For real examples of Racc in use, refer to TMail or RDtool.

TMail (uses #yylex. The lexer is written in Ruby and C)
http://raa.ruby-lang.org/list.rhtml?name=tmail

RDtool (uses #do_parse and #next_token.
The lexer is written in Ruby using strscan)
http://raa.ruby-lang.org/list.rhtml?name=rdtool

Why not bundle a lexer with racc?

Phil

In mail “Re: rlex and ryacc”

···

ptkwt@aracnet.com (Phil Tomson) wrote:

I’m using StringScanner (strscan). It is NOT a lexer generator,
but it is sufficient for me (including speed).

For real examples of Racc in use, refer to TMail or RDtool.

TMail (uses #yylex. The lexer is written in Ruby and C)
http://raa.ruby-lang.org/list.rhtml?name=tmail

RDtool (uses #do_parse and #next_token.
The lexer is written in Ruby using strscan)
http://raa.ruby-lang.org/list.rhtml?name=rdtool

Why not bundle a lexer with racc?

Because strscan comes with ruby 1.8. Maintaining the same package
in two locations (for Racc and for Ruby 1.8) is undesirable
(for me :-).

Regards,
Minero Aoki