> PEGs have a number of disadvantages.
> Two of which are that they cannot handle infinite streams like socket
> protocols.
Such protocols always consist of a continuous stream of packets,
and you can parse those one by one.
Hi Clifford.
Well, I wasn't talking about packets; more protocols and things like
that. Sometimes you have to parse to figure out when to stop reading
all the elements of a single request. Sometimes this is easy (look for
a blank line or a particular token), but sometimes you really do have to
parse to figure out when to stop.
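To make the "you have to parse to know when to stop" point concrete, here is a minimal sketch in Python. It assumes a made-up wire format in which each request is a parenthesized, possibly nested expression; the class name `RequestSplitter` and the format itself are hypothetical, and quoted strings or escapes are deliberately ignored. The only way to find the end of a request here is to track nesting depth as the bytes arrive.

```python
class RequestSplitter:
    """Finds request boundaries in a stream of nested '(' ... ')' messages.

    Hypothetical format: one request is one balanced parenthesized
    expression; anything between requests is treated as a separator.
    """

    def __init__(self):
        self.buffer = []   # characters of the request currently being read
        self.depth = 0     # current nesting depth inside that request

    def feed(self, chunk):
        """Consume one chunk; return the requests completed by it."""
        done = []
        for ch in chunk:
            if self.depth == 0 and ch != "(":
                continue               # skip separators between requests
            self.buffer.append(ch)
            if ch == "(":
                self.depth += 1
            elif ch == ")":
                self.depth -= 1
                if self.depth == 0:    # balanced again: the request ends here
                    done.append("".join(self.buffer))
                    self.buffer = []
        return done


splitter = RequestSplitter()
first = splitter.feed("(get (a b)")    # request still open
second = splitter.feed(" c)(put")      # closes one request, opens another
third = splitter.feed(" x)\n")         # closes the second request
```

Chunk boundaries fall wherever the socket happens to deliver them, so the state (`buffer`, `depth`) has to survive between calls to `feed`.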
That's not what happens with Treetop, which is pure PEG,
and emits more useful error messages than most hard-core
parsers. It simply records for the furthermost failure
point which terminals would have allowed it to get further.
Cheap and cheerful, but eminently comprehensible to a human.
Oh, right. The Grimm/Rats! guy told me about that trick. Nice, though I'm
guessing you're going to have trouble with recovery. Bailing out at the
first error is something most people don't want to happen.
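The furthest-failure trick described above can be sketched in a few lines of Python. This is not Treetop's code (Treetop is Ruby, and its internals are not shown in this thread); the `Parser` class, its method names, and the toy grammar are assumptions made up for illustration. Every failed terminal match reports its offset, and the parser keeps only the set of terminals expected at the furthest offset reached.

```python
class Parser:
    """Tiny PEG-style matcher that tracks the furthest failure point."""

    def __init__(self, text):
        self.text = text
        self.fail_pos = -1       # furthest offset at which a terminal failed
        self.expected = set()    # terminals that would have matched there

    def terminal(self, pos, literal):
        """Match a literal at pos; return the new offset or None."""
        if self.text.startswith(literal, pos):
            return pos + len(literal)
        if pos > self.fail_pos:          # new furthest failure: reset the set
            self.fail_pos, self.expected = pos, {literal}
        elif pos == self.fail_pos:       # same offset: add another alternative
            self.expected.add(literal)
        return None

    def greeting(self, pos):
        """greeting <- 'hello ' ('world' / 'there')"""
        pos = self.terminal(pos, "hello ")
        if pos is None:
            return None
        for alt in ("world", "there"):   # ordered choice
            end = self.terminal(pos, alt)
            if end is not None:
                return end
        return None

    def error_message(self):
        names = ", ".join(sorted(repr(t) for t in self.expected))
        return "at offset %d expected one of: %s" % (self.fail_pos, names)


p = Parser("hello wrld")
result = p.greeting(0)       # None: the parse fails
message = p.error_message()  # "at offset 6 expected one of: 'there', 'world'"
```

Note that the sketch stops at the first error, which is exactly the recovery limitation raised above.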
> I should say that the goal of Treetop is different from most
> parser generators - it's to be able to incrementally re-parse
> input that's being edited with insertions and deletions, using
> whatever is still valid of the memoization. Nathan's still
> working on subtle bugs in the invalidation mechanism, but I
> think it's an interesting project, no?
You bet. Incremental parsing with recursive descent parsers has to be
done with a mechanism strikingly similar to memoization. That's the
easy part once you figure out the trick. The part that I have not
figured out is the lexer. Since PEGs are usually scannerless, perhaps
the same mechanism just plain works; let me know how that goes. I
would be interested in the results.
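As a rough illustration of reusing memoized results across an edit, here is a Python sketch. It is not Treetop's invalidation mechanism; it assumes each memo entry records how far into the input the rule looked (`examined`), and the class `IncrementalMatcher` and its toy grammar are hypothetical. After an edit, entries that only saw text before the edit are kept, entries that start after the edit are shifted, and everything overlapping the edit is dropped.

```python
class IncrementalMatcher:
    def __init__(self, text):
        self.text = text
        self.memo = {}     # (rule, pos) -> (end or None, examined)
        self.calls = 0     # counts real (non-memoized) rule evaluations

    def number(self, pos):
        """number <- [0-9]+ ; memoized on (rule, pos)."""
        key = ("number", pos)
        if key in self.memo:
            return self.memo[key][0]
        self.calls += 1
        end = pos
        while end < len(self.text) and self.text[end].isdigit():
            end += 1
        examined = end + 1   # we also looked at the char (or EOF) that stopped us
        self.memo[key] = (end if end > pos else None, examined)
        return self.memo[key][0]

    def numbers(self):
        """list <- number (',' number)* ; returns the matched spans."""
        spans, pos = [], 0
        while True:
            end = self.number(pos)
            if end is None:
                return spans
            spans.append((pos, end))
            if end < len(self.text) and self.text[end] == ",":
                pos = end + 1
            else:
                return spans

    def edit(self, start, removed, inserted):
        """Replace text[start:start+removed] and repair the memo table."""
        delta = len(inserted) - removed
        self.text = self.text[:start] + inserted + self.text[start + removed:]
        repaired = {}
        for (rule, pos), (end, examined) in self.memo.items():
            if examined <= start:                 # saw only untouched text
                repaired[(rule, pos)] = (end, examined)
            elif pos >= start + removed:          # lies wholly after the edit
                new_end = None if end is None else end + delta
                repaired[(rule, pos + delta)] = (new_end, examined + delta)
            # entries overlapping the edit are dropped and re-parsed on demand
        self.memo = repaired


m = IncrementalMatcher("12,34,56")
before = m.numbers()          # three real evaluations of `number`
m.edit(3, 2, "345")           # change "34" to "345"
after = m.numbers()           # only the edited number is re-evaluated
```

The shifting step is only safe for rules whose result depends on nothing but the text they examined; rules with semantic predicates or other context would need more careful invalidation.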
> > Also, action
> execution is extremely difficult in the face of backtracking,
> particularly continuously.
That's not as much of an issue as you might think. Many
Actually, my experience is that people like ANTLR because they can put
actions anywhere. Over 24 years of observing parser building, it seems
that adding arbitrary actions to the grammar is pretty common. Many
users are uncomfortable building an intermediate data structure. They
just want to do some syntax directed translation with actions in the
grammar.
stuff like symbol tables and tree construction can usually be done as
undoable actions. Though, not side effect free, right, since they
build data structures.
actions down in the leaves can be side-effect free, so
the effects get lost when that branch is abandoned.
These are typically undoable, not side-effect free.
More importantly, you cannot use semantic predicates if actions are
executed with side effects during the parse. You can't test if an
identifier is in the symbol table if you can't add it.
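One way to picture "undoable" actions is a symbol table with an undo log. The Python sketch below is only an illustration under that assumption; `UndoableSymbolTable`, `mark`, and `rollback` are hypothetical names, not the API of ANTLR, Treetop, or Rats!. The action really does mutate the table, so a semantic predicate can see the new symbol while the parser is speculating, and the mutation is reversed if the alternative is abandoned.

```python
class UndoableSymbolTable:
    def __init__(self):
        self.symbols = {}
        self.trail = []          # undo log: one closure per side effect

    def define(self, name, info):
        """Add or overwrite a symbol and remember how to reverse it."""
        missing = object()
        old = self.symbols.get(name, missing)

        def undo():
            if old is missing:
                del self.symbols[name]
            else:
                self.symbols[name] = old

        self.symbols[name] = info
        self.trail.append(undo)

    def is_defined(self, name):
        """What a semantic predicate would ask during the parse."""
        return name in self.symbols

    def mark(self):
        """Remember the current length of the undo log."""
        return len(self.trail)

    def rollback(self, mark):
        """Undo every side effect recorded since `mark`, newest first."""
        while len(self.trail) > mark:
            self.trail.pop()()


table = UndoableSymbolTable()
table.define("x", "int")                # committed before any speculation
mark = table.mark()
table.define("T", "typedef")            # action run while speculating
seen_during = table.is_defined("T")     # the predicate can see it
table.rollback(mark)                    # the alternative was abandoned
seen_after = table.is_defined("T")      # the effect is gone again
```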
Near
the top of the tree there are very often rules that cannot
backtrack, and actions with side-effects are ok there. It
does take a little care and comprehension, but all coding
does.
Well, you should either make generic actions illegal, as most do, or
keep them from running while you're backtracking. It's just too easy to
introduce bugs by tweaking your grammar; this should not be left to the user.
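Keeping actions out while backtracking can be sketched with a simple depth counter. The Python below is a hand-written illustration, not generated parser output; `GatedParser`, `speculate`, and the two toy rules are assumptions. User actions go through `action`, which runs them only when the parser is not guessing, so the grammar author cannot accidentally fire a side effect from a failed alternative.

```python
class GatedParser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0
        self.backtracking = 0    # > 0 while the parser is only guessing
        self.log = []            # stands in for arbitrary user side effects

    def action(self, fn):
        """Run a user action only when not speculating."""
        if self.backtracking == 0:
            fn()

    def match(self, tok):
        if self.pos < len(self.tokens) and self.tokens[self.pos] == tok:
            self.pos += 1
            return True
        return False

    def speculate(self, rule):
        """Try a rule with actions disabled, then rewind the input."""
        start = self.pos
        self.backtracking += 1
        try:
            return rule()
        finally:
            self.pos = start
            self.backtracking -= 1

    def assign(self):
        """assign <- 'id' '=' 'num'"""
        if not self.match("id"):
            return False
        self.action(lambda: self.log.append("assign: saw id"))
        return self.match("=") and self.match("num")

    def call(self):
        """call <- 'id' '(' ')'"""
        if not self.match("id"):
            return False
        self.action(lambda: self.log.append("call: saw id"))
        return self.match("(") and self.match(")")

    def statement(self):
        """statement <- assign / call, deciding by speculation."""
        if self.speculate(self.assign):
            return self.assign()     # re-parse for real, actions enabled
        return self.call()


parser = GatedParser(["id", "(", ")"])
ok = parser.statement()
```

Without the gate, the failed `assign` attempt would have left "assign: saw id" in the log even though the input turned out to be a call.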
> At present, Treetop offers no mechanism for runtime execution,
> unless you build it into the constructors of custom SyntaxNodes.
Ah. Ok, that's good.
> Consequently, it is my opinion that pure
> PEG based parser generators are not viable commercially
I believe I've established some counter-arguments, though in
general I agree that avoidance of backtracking is preferable
if you have an effective tool for it. I'd like to find the
time to make ANTLR work properly with Ruby, but I already have
a (more) significant project under way. Maybe Eric will do it?
As before, I'm very happy to help. Minus the optional parentheses, I
like Ruby.
Thanks for your detailed and well thought out counterarguments! I
forgot about the error handling thing for example.
Ter
On Sep 30, 4:56 am, Clifford Heath <n...@spam.please.net> wrote:
pa...@antlr.org wrote: