Mark Probert wrote:
My personal solution to this is to use Coco/R, an LL(1) scanner and parser generator.
I haven't used Coco, but I'd second Mark's recommendation to stay with
an LL(1) parser if possible. If not, then LL(n) à la ANTLR, but not LALR
à la yacc. The LL parsers are easy to write manually using recursive
descent, which I've done a few times.
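For anyone who hasn't seen one, a hand-written recursive-descent parser might look like the minimal sketch below. The grammar, the TinyParser class name and the tokenising regex are all made up for illustration; the point is only that each grammar rule becomes one method, and one token of lookahead (peek) is enough to choose every branch.

```ruby
# Grammar (LL(1): one token of lookahead picks every branch):
#   expr   -> term   { ("+" | "-") term }
#   term   -> factor { ("*" | "/") factor }
#   factor -> NUMBER | "(" expr ")"
class TinyParser
  def initialize(source)
    # Crude tokeniser: integers and single-character operators.
    # Anything else (including whitespace) is silently skipped.
    @tokens = source.scan(%r{\d+|[-+*/()]})
    @pos = 0
  end

  def parse
    value = expr
    raise "unexpected token #{peek.inspect}" unless peek.nil?
    value
  end

  private

  # The single token of lookahead.
  def peek
    @tokens[@pos]
  end

  def advance
    token = @tokens[@pos]
    @pos += 1
    token
  end

  def expr
    value = term
    while ["+", "-"].include?(peek)
      op = advance
      rhs = term
      value = op == "+" ? value + rhs : value - rhs
    end
    value
  end

  def term
    value = factor
    while ["*", "/"].include?(peek)
      op = advance
      rhs = factor
      value = op == "*" ? value * rhs : value / rhs
    end
    value
  end

  def factor
    token = advance
    if token == "("
      value = expr
      raise "expected ')'" unless advance == ")"
      value
    elsif token.to_s.match?(/\A\d+\z/)
      token.to_i
    else
      raise "unexpected token #{token.inspect}"
    end
  end
end

puts TinyParser.new("2 + 3 * (4 - 1)").parse  # prints 11
```

It evaluates as it goes rather than building a tree, purely to keep the sketch short; returning AST nodes from each method instead is a small change.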
I dug out my copy of "Crafting a Compiler" (ISBN 0-8053-3021-4)
last night, and agree that LL(1) is simpler because it can be done
from a fairly straightforward table. Yacc and Bison are LALR, I
think.
[1] I find that thinking in the manner of a shift/reduce parser is
particularly unnatural to me. ... Maybe there is something I can read which will turn the problem around, so it becomes easy to handle?
Shift just means "delay a decision about what I've just seen". Reduce is
the operation you do when you do decide. If you explore the ambiguity
in your grammar rules, these start to make more sense.
That's a good description, but it's difficult to hold this
state in one's (or is that just my?) head.
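One way to get the state out of one's head is to print it. The toy below is only a sketch: it is a simplified operator-precedence flavour of shift/reduce, not the LALR tables yacc generates, and the shift_reduce name, the PRECEDENCE table and the two-operator grammar are invented for illustration. It does show the "delay the decision" idea: after seeing 1 + 2 the parser cannot yet know whether to add, so it shifts and decides later.

```ruby
# Hypothetical two-operator grammar; a higher number binds tighter.
PRECEDENCE = { "+" => 1, "*" => 2 }.freeze

# Returns [final_stack, trace], where trace lists each shift/reduce step.
# Assumes well-formed input of the form: number (op number)*
def shift_reduce(tokens)
  stack = []
  trace = []
  input = tokens.dup
  loop do
    lookahead = input.first
    # Reduce when the stack ends in "value op value" and the upcoming
    # operator (if any) does not bind tighter than the one on the stack.
    if stack.size >= 3 && PRECEDENCE.key?(stack[-2]) &&
       PRECEDENCE.fetch(lookahead, 0) <= PRECEDENCE[stack[-2]]
      rhs = stack.pop
      op = stack.pop
      lhs = stack.pop
      stack.push(op == "+" ? lhs + rhs : lhs * rhs)
      trace << "reduce #{lhs} #{op} #{rhs}"
    elsif lookahead
      # Shift: push the token and put off deciding what it belongs to.
      token = input.shift
      stack.push(token.match?(/\A\d+\z/) ? token.to_i : token)
      trace << "shift #{token}"
    else
      break
    end
  end
  [stack, trace]
end

stack, trace = shift_reduce(%w[1 + 2 * 3])
puts trace
# shift 1
# shift +
# shift 2
# shift *
# shift 3
# reduce 2 * 3
# reduce 1 + 6
```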
The primary disadvantage of Coco/R is the LL(1) part.
ANTLR does LL(n) for arbitrary n, I believe, though you should avoid
n > 3 or humans start to have trouble parsing your language :-).
It's a shame that the ANTLR folk at Purdue went Java-only when they
dropped their old C-based implementation. A multi-lingual ANTLR would
be super-cool, especially if it would generate Ruby.
There is also PCCTS
http://dynamo.ecn.purdue.edu/~hankd/PCCTS/
but, although very flexible, it is not what I'd call simple. I
haven't retained much of what I read about it, but I remember it
as a well-designed tool with enough complexity to meet its
goals.
As an example, Ruby cannot, as far as I have tried, be converted into an LL(1) grammar, though C can.
Not without a tie-in to the lexical analyser to help recognise goto
labels, which require LL(2). Such a tie-in is commonly used however.
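To illustrate the label problem: at the start of a statement, an identifier on its own doesn't tell you whether you're looking at a label or an expression, so you need the token after it too. The sketch below is deliberately simplified (the statement_kind name and the pre-split token arrays are assumptions, and real C has more statement forms than these two); a lexer tie-in would make the same two-token check and hand the parser a single LABEL token instead.

```ruby
# Decide what kind of statement starts at the front of a token array.
# One token ("retry") is not enough; the second token settles it.
def statement_kind(tokens)
  first = tokens[0]
  second = tokens[1]
  if first.to_s.match?(/\A[A-Za-z_]\w*\z/) && second == ":"
    :label
  else
    :expression_statement
  end
end

p statement_kind(%w[retry : count = 0 ;])  # prints :label
p statement_kind(%w[retry = 1 ;])          # prints :expression_statement
```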
A simple example of the Ruby grammar
Good example, thanks Mark.
I should point out that the major reason for the success of XML
(contrary to most of the hyped claims about it) is that it allows
people to create languages without having to create parsers. Or
rather, they use an XML parser which yields a DOM, and can process
the AST at will.
This is a good point. I'd not really given much thought to XML.
I'm thinking of this for describing problems expressed in
constructive solid geometry, and XML is pretty unpleasant to edit by
hand. I've not got into XSLT, but maybe there {is, could be} a utility that
could take something with Rubyesque blocks and transform it into
XML. I'm not in a position to start writing it, and, of course, I'd
have to parse the input....
"Hence, by induction..."
If you can live with the ugliness of XML and the size and speed of REXML,
REXML does make XML handling nice, I have to say.
you should consider it.
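As a rough idea of what "no parser to write" looks like for the solid geometry case, here is a minimal sketch using REXML. The element and attribute names (difference, cube, sphere, size, radius) are invented for illustration and not taken from any real CSG format, and it assumes the rexml library is available to require, which it is in a standard Ruby install.

```ruby
require "rexml/document"

# Hypothetical CSG description; these names are made up for the example.
CSG_XML = <<~XML
  <difference>
    <cube size="2"/>
    <sphere radius="1.2"/>
  </difference>
XML

# Turn a REXML element into nested Ruby arrays:
#   [name, attributes_hash, children]
def to_tree(element)
  attributes = {}
  element.attributes.each { |name, value| attributes[name] = value }
  [element.name, attributes, element.elements.map { |child| to_tree(child) }]
end

tree = to_tree(REXML::Document.new(CSG_XML).root)
p tree
```

The XML parser does all the syntax work, and what comes back is already a tree you can walk; the price is having to type the angle brackets by hand.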
There's no good reason why a language like Ruby shouldn't have
grammar rules as first-class objects (as Regexps are), yielding
Ruby objects that reflect the AST, allowing attribute-grammar
parsers to be written and integrated directly within a program.
"That's a Rite good idea, is that!"
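Short of changing the interpreter, something in this direction can be approximated in plain Ruby today. The sketch below is only one possible shape, assuming a hypothetical Rule class (not an existing library): each rule is an ordinary object, rules combine with >> (sequence) and | (alternative), and parsing yields nested Ruby arrays standing in for the AST.

```ruby
# A grammar rule as an ordinary Ruby object.
# #parse returns [ast, next_position] on success, or nil on failure.
class Rule
  def initialize(&matcher)
    @matcher = matcher
  end

  def parse(input, pos = 0)
    @matcher.call(input, pos)
  end

  # Sequence: this rule followed by another.
  def >>(other)
    Rule.new do |input, pos|
      first = parse(input, pos)
      next nil unless first
      second = other.parse(input, first[1])
      next nil unless second
      [[first[0], second[0]], second[1]]
    end
  end

  # Alternative: this rule, or else another at the same position.
  def |(other)
    Rule.new { |input, pos| parse(input, pos) || other.parse(input, pos) }
  end

  # Terminal: skip leading whitespace, then match a pattern at pos.
  def self.token(pattern, label)
    Rule.new do |input, pos|
      match = /\G\s*(#{pattern.source})/.match(input, pos)
      match && [[label, match[1]], match.end(0)]
    end
  end
end

number = Rule.token(/\d+/, :number)
plus   = Rule.token(/\+/, :plus)
minus  = Rule.token(/-/, :minus)
sum    = number >> (plus | minus) >> number

ast, pos = sum.parse("1 + 2")
p ast  # prints [[[:number, "1"], [:plus, "+"]], [:number, "2"]]
```

This has no attributes, no error reporting and no left-recursion handling, so it is a long way from the integrated attribute-grammar tool proposed above.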
Such a tool, integrated into the Ruby interpreter itself, would
allow extension modules to define *Ruby syntax extensions*, so
that the language itself becomes plastic.
I haven't thought much about what these last two features would
look like in Ruby's case.
Clifford Heath.
Thank you,
Hugh
On Thu, 27 Jan 2005, Clifford Heath wrote: